monitoring bits to CPUID instruction. Corrected exception tables 
for POPF, SFENCE, SUB, XLAT, IRET, LSL, MOV(CRn), 
SGDT/SIDT, SMSW, and STI instructions. Corrected many small 
typos and incorporated branding terminology. 

About This Book 

This book is part of a multivolume work entitled the AMD64 Architecture Programmer’s Manual. The 
following table lists each volume and its order number. 


Title                                                         Order No.

Volume 1: Application Programming                             24592
Volume 2: System Programming                                  24593
Volume 3: General-Purpose and System Instructions             24594
Volume 4: 128-Bit and 256-Bit Media Instructions              26568
Volume 5: 64-Bit Media and x87 Floating-Point Instructions    26569


Audience 

This volume (Volume 3) is intended for all programmers writing application or system software for a 
processor that implements the AMD64 architecture. Descriptions of general-purpose instructions 
assume an understanding of the application-level programming topics described in Volume 1. 
Descriptions of system instructions assume an understanding of the system-level programming topics 
described in Volume 2. 

Organization 

Volumes 3, 4, and 5 describe the AMD64 architecture’s instruction set in detail. Together, they cover 
each instruction’s mnemonic syntax, opcodes, functions, affected flags, and possible exceptions. 

The AMD64 instruction set is divided into five subsets: 

• General-purpose instructions 

• System instructions 

• Streaming SIMD Extensions-SSE (includes 128-bit and 256-bit media instructions) 

• 64-bit media instructions (MMX™) 

• x87 floating-point instructions 

Several instructions belong to—and are described identically in—multiple instruction subsets. 

This volume describes the general-purpose and system instructions. The index at the end 
cross-references topics within this volume. For other topics relating to the AMD64 architecture, and for 
information on instructions in other subsets, see the tables of contents and indexes of the other 
volumes. 

Conventions and Definitions 

The following section, “Notational Conventions,” describes notational conventions used in this volume 
and in the remaining volumes of this AMD64 Architecture Programmer’s Manual. This is followed 
by a “Definitions” section, which lists a number of terms used in the manual along with their technical 
definitions. Finally, the “Registers” section lists the registers that are part of the application 
programming model. 

Notational Conventions 

#GP(0) 

An instruction exception; in this example, a general-protection exception with error code of 0. 

1011b 

A binary value—in this example, a 4-bit value. 

F0EA_0B02h 

A hexadecimal value. Underscore characters may be inserted to improve readability. 


128 

Numbers without an alpha suffix are decimal unless the context indicates otherwise. 


7:4 

A bit range, from bit 7 to 4, inclusive. The high-order bit is shown first. Commas may be inserted 
to indicate gaps. 

CPUID FnXXXX_XXXX_RRR[FieldName] 

Support for optional features or the value of an implementation-specific parameter of a processor 
can be discovered by executing the CPUID instruction on that processor. To obtain this value, 
software must execute the CPUID instruction with the function code XXXX_XXXXh in EAX and 
then examine the field FieldName returned in register RRR. If the “_RRR” notation is followed by 
“_xYYY”, register ECX must be set to the value YYYh before executing CPUID. When FieldName 
is not given, the entire contents of register RRR contain the desired value. When determining 
optional feature support, if the bit identified by FieldName is set to a one, the feature is supported 
on that processor. 

CR0-CR4 

A register range, from register CR0 through CR4, inclusive, with the low-order register first. 

CR0[PE], CR0.PE 

Notation for referring to a field within a register; in this case, the PE field of the CR0 register. 

CR0[PE] = 1, CR0.PE = 1 

Notation indicating that the PE bit of the CR0 register has a value of 1. 

DS:rSI 

The contents of a memory location whose segment address is in the DS register and whose offset 
relative to that segment is in the rSI register. 

EFER[LME] = 0, EFER.LME = 0 

Notation indicating that the LME bit of the EFER register has a value of 0. 

RFLAGS[13:12] 

A field within a register identified by its bit range. In this example, corresponding to the IOPL 
field. 
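
The notational conventions above are regular enough to be read mechanically. The following is a minimal sketch, not part of the manual: the helper names `parse_value`, `bit_field`, and `parse_cpuid_reference` are hypothetical, and the two CPUID strings used in the usage note are only sample inputs for the parser.

```python
import re


def parse_value(text: str) -> int:
    """Convert a literal such as '1011b', 'F0EA_0B02h', or '128' to an int.

    Hypothetical helper: a trailing 'b' marks binary, a trailing 'h' marks
    hexadecimal, anything else is treated as decimal. Underscores are
    readability separators and are dropped.
    """
    digits = text.strip().replace("_", "")
    suffix = digits[-1:].lower()
    if suffix == "h":
        return int(digits[:-1], 16)
    if suffix == "b":
        return int(digits[:-1], 2)
    return int(digits, 10)


def bit_field(value: int, high: int, low: int) -> int:
    """Return bits high:low of value, inclusive (high-order bit first)."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


# Pattern for the "CPUID FnXXXX_XXXX_RRR[_xYYY][FieldName]" notation.
_CPUID_REF = re.compile(
    r"CPUID\s+Fn\s*(?P<function>[0-9A-Fa-f]{4}_[0-9A-Fa-f]{4})"
    r"_(?P<register>E[A-D]X)"
    r"(?:_x(?P<subfunction>[0-9A-Fa-f]+))?"
    r"(?:\[(?P<field>[^\]]+)\])?\s*$"
)


def parse_cpuid_reference(text: str) -> dict:
    """Split a CPUID reference into the inputs and outputs it names."""
    match = _CPUID_REF.match(text.strip())
    if match is None:
        raise ValueError(f"not a CPUID reference: {text!r}")
    sub = match.group("subfunction")
    return {
        "eax_in": int(match.group("function").replace("_", ""), 16),
        "ecx_in": int(sub, 16) if sub is not None else None,
        "register": match.group("register"),
        "field": match.group("field"),  # None means the whole register
    }
```

For example, `bit_field(rflags, 13, 12)` returns the two-bit field that the RFLAGS[13:12] notation denotes, and `parse_cpuid_reference("CPUID Fn0000_0007_EBX_x0[BMI1]")` reports that EAX must hold 7 and ECX must hold 0 before CPUID is executed.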

Definitions 

Many of the following definitions assume an in-depth knowledge of the legacy x86 architecture. See 
“Related Documents” on page xxxiii for descriptions of the legacy x86 architecture. 

128-bit media instructions 

Instructions that operate on the various 128-bit vector data types. Supported within both the legacy 
SSE and extended SSE instruction sets. 

256-bit media instructions 

Instructions that operate on the various 256-bit vector data types. Supported within the extended 
SSE instruction set. 

64-bit media instructions 

Instructions that operate on the 64-bit vector data types. These are primarily a combination of 
MMX™ and 3DNow!™ instruction sets, with some additional instructions from the SSE1 and 
SSE2 instruction sets. 

16-bit mode 

Legacy mode or compatibility mode in which a 16-bit address size is active. See legacy mode and 
compatibility mode. 

32-bit mode 

Legacy mode or compatibility mode in which a 32-bit address size is active. See legacy mode and 
compatibility mode. 

64-bit mode 

A submode of long mode. In 64-bit mode, the default address size is 64 bits and new features, such 
as register extensions, are supported for system and application software. 

absolute 

Said of a displacement that references the base of a code segment rather than an instruction pointer. 
Contrast with relative. 

biased exponent 

The sum of a floating-point value’s exponent and a constant bias for a particular floating-point data 
type. The bias makes the range of the biased exponent always positive, which allows reciprocation 
without overflow. 

byte 

Eight bits. 

clear 

To write a bit value of 0. Compare set. 

compatibility mode 

A submode of long mode. In compatibility mode, the default address size is 32 bits, and legacy 
16-bit and 32-bit applications run without modification. 

commit 

To irreversibly write, in program order, an instruction’s result to software-visible storage, such as a 
register (including flags), the data cache, an internal write buffer, or memory. 

CPL 

Current privilege level. 

direct 

Referencing a memory location whose address is included in the instruction’s syntax as an 
immediate operand. The address may be an absolute or relative address. Compare indirect. 

dirty data 

Data held in the processor’s caches or internal buffers that is more recent than the copy held in 
main memory. 

displacement 

A signed value that is added to the base of a segment (absolute addressing) or an instruction pointer 
(relative addressing). Same as offset. 

doubleword 

Two words, or four bytes, or 32 bits. 

double quadword 

Eight words, or 16 bytes, or 128 bits. Also called octword. 

effective address size 

The address size for the current instruction after accounting for the default address size and any 
address-size override prefix. 

effective operand size 

The operand size for the current instruction after accounting for the default operand size and any 
operand-size override prefix. 

element 

See vector. 

exception 

An abnormal condition that occurs as the result of executing an instruction. The processor’s 
response to an exception depends on the type of the exception. For all exceptions except 128-bit 
media SIMD floating-point exceptions and x87 floating-point exceptions, control is transferred to 
the handler (or service routine) for that exception, as defined by the exception’s vector. For 
floating-point exceptions defined by the IEEE 754 standard, there are both masked and unmasked 
responses. When unmasked, the exception handler is called, and when masked, a default response 
is provided instead of calling the handler. 

flush 

An often ambiguous term meaning (1) writeback, if modified, and invalidate, as in “flush the cache 
line,” or (2) invalidate, as in “flush the pipeline,” or (3) change a value, as in “flush to zero.” 

GDT 

Global descriptor table. 

IDT 

Interrupt descriptor table. 

IGN 

Ignored. Value written is ignored by hardware. Value returned on a read is indeterminate. See 
reserved. 

indirect 

Referencing a memory location whose address is in a register or other memory location. The 
address may be an absolute or relative address. Compare direct. 

IRB 

The virtual-8086 mode interrupt-redirection bitmap. 

IST 

The long-mode interrupt-stack table. 

IVT 

The real-address mode interrupt-vector table. 

LDT 

Local descriptor table. 

legacy x86 

The legacy x86 architecture. See “Related Documents” on page xxxiii for descriptions of the 
legacy x86 architecture. 

legacy mode 

An operating mode of the AMD64 architecture in which existing 16-bit and 32-bit applications and 
operating systems run without modification. A processor implementation of the AMD64 
architecture can run in either long mode or legacy mode. Legacy mode has three submodes: real 
mode, protected mode, and virtual-8086 mode. 

long mode 

An operating mode unique to the AMD64 architecture. A processor implementation of the 
AMD64 architecture can run in either long mode or legacy mode. Long mode has two submodes: 
64-bit mode and compatibility mode. 


lsb 

Least-significant bit. 

LSB 

Least-significant byte. 

main memory 

Physical memory, such as RAM and ROM (but not cache memory) that is installed in a particular 
computer system. 

mask 

(1) A control bit that prevents the occurrence of a floating-point exception from invoking an 
exception-handling routine. (2) A field of bits used for a control purpose. 

MBZ 

Must be zero. If software attempts to set an MBZ bit to 1, a general-protection exception (#GP) 
occurs. 

memory 

Unless otherwise specified, main memory. 

ModRM 

A byte following an instruction opcode that specifies address calculation based on mode (Mod), 
register (R), and memory (M) variables. 

moffset 

A 16, 32, or 64-bit offset that specifies a memory operand directly, without using a ModRM or SIB 
byte. 

msb 

Most-significant bit. 

MSB 

Most-significant byte. 

multimedia instructions 

A combination of 128-bit media instructions and 64-bit media instructions. 

octword 

Same as double quadword. 

offset 

Same as displacement. 

overflow 

The condition in which a floating-point number is larger in magnitude than the largest, finite, 
positive or negative number that can be represented in the data-type format being used. 

packed 

See vector. 

PAE 

Physical-address extensions. 

physical memory 

Actual memory, consisting of main memory and cache. 

probe 

A check for an address in a processor’s caches or internal buffers. External probes originate 
outside the processor, and internal probes originate within the processor. 

protected mode 

A submode of legacy mode. 

quadword 

Four words, or eight bytes, or 64 bits. 

RAZ 

Read as zero. Value returned on a read is always zero (0) regardless of what was previously 
written. See reserved. 

real-address mode 

See real mode. 

real mode 

A short name for real-address mode, a submode of legacy mode. 

relative 

Referencing with a displacement (also called offset) from an instruction pointer rather than the 
base of a code segment. Contrast with absolute. 

reserved 

Fields marked as reserved may be used at some future time. 

To preserve compatibility with future processors, reserved fields require special handling when 
read or written by software. Software must not depend on the state of a reserved field (unless 
qualified as RAZ), nor upon the ability of such fields to return a previously written state. 

If a field is marked reserved without qualification, software must not change the state of that field; 
it must reload that field with the same value returned from a prior read. 

Reserved fields may be qualified as IGN, MBZ, RAZ, or SBZ (see definitions). 

REX 

An instruction prefix that specifies a 64-bit operand size and provides access to additional 
registers. 

RIP-relative addressing 

Addressing relative to the 64-bit RIP instruction pointer. 

SBZ 

Should be zero. An attempt by software to set an SBZ bit to 1 results in undefined behavior. 


set 

To write a bit value of 1. Compare clear. 


SIB 

A byte following an instruction opcode that specifies address calculation based on scale (S), index 
(I), and base (B). 

SIMD 

Single instruction, multiple data. See vector. 

SSE 

Streaming SIMD extensions instruction set. See 128-bit media instructions and 64-bit media 
instructions. 

SSE2 

Extensions to the SSE instruction set. See 128-bit media instructions and 64-bit media 
instructions. 

SSE3 

Further extensions to the SSE instruction set. See 128-bit media instructions. 

sticky bit 

A bit that is set or cleared by hardware and that remains in that state until explicitly changed by 
software. 

TOP 

The x87 top-of-stack pointer. 

TPR 

Task-priority register (CR8). 

TSS 

Task-state segment. 

underflow 

The condition in which a floating-point number is smaller in magnitude than the smallest nonzero, 
positive or negative number that can be represented in the data-type format being used. 

vector 

(1) A set of integer or floating-point values, called elements, that are packed into a single operand. 
Most of the 128-bit and 64-bit media instructions use vectors as operands. Vectors are also called 
packed or SIMD (single-instruction multiple-data) operands. 

(2) An index into an interrupt descriptor table (IDT), used to access exception handlers. Compare 
exception. 

virtual-8086 mode 

A submode of legacy mode. 

word 

Two bytes, or 16 bits. 

x86 

See legacy x86. 
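
As an illustration of the biased exponent entry in the definitions above, the following minimal sketch extracts the biased exponent field from a value encoded in IEEE 754 single precision. The choice of single precision, its bias of 127, and the function names are assumptions made for this example; the definition itself applies to every floating-point data type, each with its own bias. The arithmetic is valid for normal (non-denormal, finite) values.

```python
import struct

SINGLE_PRECISION_BIAS = 127  # exponent bias of IEEE 754 single precision


def biased_exponent(value: float) -> int:
    """Return the 8-bit biased exponent field of value as single precision."""
    # Reinterpret the 4-byte single-precision encoding as an unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Bits 30:23 hold the biased exponent; bit 31 is the sign.
    return (bits >> 23) & 0xFF


def true_exponent(value: float) -> int:
    """Return the unbiased exponent of a normal single-precision value."""
    return biased_exponent(value) - SINGLE_PRECISION_BIAS
```

The stored field is never negative even when the true exponent is, which is the property the definition describes.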

Registers 

In the following list of registers, the names are used to refer either to a given register or to the contents 
of that register: 

AH-DH 

The high 8-bit AH, BH, CH, and DH registers. Compare AL-DL. 

AL-DL 

The low 8-bit AL, BL, CL, and DL registers. Compare AH-DH. 

AL-r15B 

The low 8-bit AL, BL, CL, DL, SIL, DIL, BPL, SPL, and R8B-R15B registers, available in 64-bit 
mode. 


BP 

Base pointer register. 

CRn 

Control register number n. 


CS 

Code segment register. 

eAX-eSP 

The 16-bit AX, BX, CX, DX, DI, SI, BP, and SP registers or the 32-bit EAX, EBX, ECX, EDX, 
EDI, ESI, EBP, and ESP registers. Compare rAX-rSP. 

EFER 

Extended features enable register. 

eFLAGS 

16-bit or 32-bit flags register. Compare rFLAGS. 

EFLAGS 

32-bit (extended) flags register. 


eIP 

16-bit or 32-bit instruction-pointer register. Compare rIP. 

EIP 

32-bit (extended) instruction-pointer register. 

FLAGS 

16-bit flags register. 

GDTR 

Global descriptor table register. 

GPRs 

General-purpose registers. For the 16-bit data size, these are AX, BX, CX, DX, DI, SI, BP, and SP. 
For the 32-bit data size, these are EAX, EBX, ECX, EDX, EDI, ESI, EBP, and ESP. For the 64-bit 
data size, these include RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, and R8-R15. 


IDTR 

Interrupt descriptor table register. 


IP 

16-bit instruction-pointer register. 

LDTR 

Local descriptor table register. 

MSR 

Model-specific register. 

r8-r15 

The 8-bit R8B-R15B registers, or the 16-bit R8W-R15W registers, or the 32-bit R8D-R15D 
registers, or the 64-bit R8-R15 registers. 

rAX-rSP 

The 16-bit AX, BX, CX, DX, DI, SI, BP, and SP registers, or the 32-bit EAX, EBX, ECX, EDX, 
EDI, ESI, EBP, and ESP registers, or the 64-bit RAX, RBX, RCX, RDX, RDI, RSI, RBP, and RSP 
registers. Replace the placeholder r with nothing for 16-bit size, “E” for 32-bit size, or “R” for 
64-bit size. 

RAX 

64-bit version of the EAX register. 

RBP 

64-bit version of the EBP register. 

RBX 

64-bit version of the EBX register. 

RCX 

64-bit version of the ECX register. 

RDI 

64-bit version of the EDI register. 

RDX 

64-bit version of the EDX register. 

rFLAGS 

16-bit, 32-bit, or 64-bit flags register. Compare RFLAGS. 

RFLAGS 

64-bit flags register. Compare rFLAGS. 

rIP 

16-bit, 32-bit, or 64-bit instruction-pointer register. Compare RIP. 

RIP 

64-bit instruction-pointer register. 

RSI 

64-bit version of the ESI register. 

RSP 

64-bit version of the ESP register. 

SP 

Stack pointer register. 

SS 

Stack segment register. 

TPR 

Task priority register, a new register introduced in the AMD64 architecture to speed interrupt 
management. 


TR 

Task register. 

Endian Order 

The x86 and AMD64 architectures address memory using little-endian byte-ordering. Multibyte 
values are stored with their least-significant byte at the lowest byte address, and they are illustrated 
with their least significant byte at the right side. Strings are illustrated in reverse order, because the 
addresses of their bytes increase from right to left. 
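
The byte ordering described above can be sketched as follows. This is an illustrative example only; the helper names `store_little_endian` and `illustrate` are hypothetical and do not come from the manual.

```python
def store_little_endian(value: int, size: int) -> bytes:
    """Return the bytes of value as they would sit in memory.

    Index 0 of the result is the lowest byte address and holds the
    least-significant byte.
    """
    return value.to_bytes(size, byteorder="little")


def illustrate(stored: bytes) -> str:
    """Render stored bytes the way the manual draws multibyte values:
    highest address on the left, least-significant byte on the right."""
    return " ".join(f"{byte:02X}" for byte in reversed(stored))
```

Storing the doubleword 12345678h this way places 78h at the lowest address, while the illustration still reads 12 34 56 78 from left to right.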

1 Instruction Encoding 


AMD64 technology instructions are encoded as byte strings of variable length. The order and meaning 
of each byte of an instruction’s encoding is specified by the architecture. Fields within the encoding 
specify the instruction’s basic operation, the location of the one or more source operands, and the 
destination of the result of the operation. Data to be used in the execution of the instruction or the 
computation of addresses for memory-based operands may also be included. This section describes the 
general format and parameters used by all instructions. 

For information on the specific encoding(s) for each instruction, see: 

• Chapter 3, “General-Purpose Instruction Reference.” 

• Chapter 4, “System Instruction Reference.” 

• “SSE Instruction Reference” in Volume 4. 

• “64-Bit Media Instruction Reference” in Volume 5. 

• “x87 Floating-Point Instruction Reference” in Volume 5. 

For information on determining the instruction form and operands specified by a given binary 
encoding, see Appendix A. 

1.1 Instruction Encoding Overview 

An instruction is encoded as a string between one and 15 bytes in length. The entire sequence of bytes 
that represents an instruction, including the basic operation, the location of source and destination 
operands, any operation modifiers, and any immediate and/or displacement values, is called the 
instruction encoding. The following sections discuss instruction encoding syntax and representation in 
memory. 

1.1.1 Encoding Syntax 

Figure 1-1 provides a schematic representation of the encoding syntax of an instruction. 

Figure 1-1. Instruction Encoding Syntax 

Each square in this diagram represents an instruction byte of a particular type and function. To 
understand the diagram, follow the connecting paths in the direction indicated by the arrows from 
“Start” to “End.” The squares passed through as the graph is traversed indicate the order and number of 
bytes used to encode the instruction. Note that the path shown above the legacy prefix byte loops back, 
indicating that up to four additional prefix bytes may be used in the encoding of a single instruction. 
Branches indicate points in the syntax where alternate semantics are employed based on the instruction 
being encoded. The “VEX or XOP” gate across the path leading down to the VEX prefix and XOP 
prefix blocks means that only extended instructions employing the VEX or XOP prefixes use this 
particular branch of the syntax diagram. This diagram will be further explained in the sections that 
follow. 

1.1.1.1 Legacy Prefixes 

As shown in the figure, an instruction optionally begins with up to five legacy prefixes. These prefixes 
are described in “Summary of Legacy Prefixes” on page 6. The legacy prefixes modify an instruction’s 
default address size, operand size, or segment, or they invoke a special function such as modification 
of the opcode, atomic bus-locking, or repetition. 

In the encoding of most SSE instructions, a legacy operand-size or repeat prefix is repurposed to 
modify the opcode. For the extended encodings utilizing the XOP or VEX prefixes, these prefixes are 
not allowed. 

1.1.1.2 REX Prefix 

Following the optional legacy prefix or prefixes, the REX prefix can be used in 64-bit mode to access 
the AMD64 register number and size extensions. Refer to the diagram in “Application-Programming 
Register Set” in Volume 1 for an illustration of these facilities. If a REX prefix is used, it must 
immediately precede the opcode byte or the first byte of a legacy escape sequence. The REX prefix is 
not allowed in extended instruction encodings using the VEX or XOP encoding escape prefixes. 
Violating this restriction results in an #UD exception. 

1.1.1.3 Opcode 

The opcode is a single byte that specifies the basic operation of an instruction. Every instruction 
requires an opcode. The correspondence between the binary value of an opcode and the operation it 
represents is presented in a table called an opcode map. Because it is indexed by an 8-bit value, an 
opcode map has 256 entries. Since there are more than 256 instructions defined by the architecture, 
multiple different opcode maps must be defined and the selection of these alternate opcode maps must 
be encoded in the instruction. Escape sequences provide this access to alternate opcode maps. 

If there are no opcode escapes, the primary (“one-byte”) opcode map is used. In the figure this is the 
path pointing from the REX Prefix block to the Primary opcode map block. 

“Primary Opcode Map” in Appendix A provides details concerning this opcode map. 

1.1.1.4 Escape Sequences 

Escape sequences allow access to alternate opcode maps that are distinct from the primary opcode 
map. Escape sequences may be one, two, or three bytes in length and begin with a unique byte value 
designated for this purpose in the primary opcode map. Escape sequences are of two distinct types: 
legacy escape sequences and extended escape sequences. The legacy escape sequences are covered 
here. For more details on the extended escape sequences, see “VEX and XOP Prefixes” on page 16. 

Legacy Escape Sequences 

The legacy syntax allows one 1-byte escape sequence (0Fh), and three 2-byte escape sequences (0Fh, 
0Fh; 0Fh, 38h; and 0Fh, 3Ah). The 1-byte legacy escape sequence 0Fh selects the secondary (“two-byte”) 
opcode map. In legacy terminology, the sequence [0Fh, opcode] is called a two-byte opcode. 
See “Secondary Opcode Map” in Appendix A for details concerning this opcode map. 

The 2-byte escape sequence [0Fh, 0Fh] selects the 3DNow! opcode map, which is indexed using an 
immediate byte rather than an opcode byte. In this case, the byte following the escape sequence is the 
ModRM byte instead of the opcode byte. In Figure 1-1 this is indicated by the path labeled “3DNow!” 
leaving the second 0Fh escape block. Details concerning the 3DNow! opcode map are presented in 
Section A.1.2, “3DNow!™ Opcodes” of Appendix A. 

The 2-byte escape sequences [0Fh, 38h] and [0Fh, 3Ah] respectively select the 0F_38h opcode map 
and the 0F_3Ah opcode map. These are used primarily to encode SSE instructions and are described in 
“0F_38h and 0F_3Ah Opcode Maps” in Appendix A. 
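
The legacy escape rules described in this section can be sketched as a small lookup. This is a minimal illustration under stated assumptions: the function name `select_opcode_map` is hypothetical, the input is assumed to start after any legacy and REX prefixes, and the extended (VEX and XOP) escape sequences are deliberately not handled.

```python
def select_opcode_map(code: bytes) -> tuple:
    """Return (map_name, escape_length) for a legacy-encoded instruction.

    code must begin at the first byte after any legacy and REX prefixes.
    escape_length is the number of escape bytes consumed (0, 1, or 2).
    """
    if len(code) == 0:
        raise ValueError("empty encoding")
    if code[0] != 0x0F:
        # No escape: the byte itself indexes the primary opcode map.
        return ("primary", 0)
    if len(code) < 2:
        raise ValueError("truncated escape sequence")
    second = code[1]
    if second == 0x0F:
        # The byte after this escape is ModRM; an immediate byte at the
        # end of the instruction indexes the 3DNow! opcode map.
        return ("3DNow!", 2)
    if second == 0x38:
        return ("0F_38h", 2)
    if second == 0x3A:
        return ("0F_3Ah", 2)
    # Any other second byte is an opcode in the secondary map.
    return ("secondary", 1)
```

A caller would skip `escape_length` bytes and then read the opcode byte, or the ModRM byte in the 3DNow! case.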

1.1.1.5 ModRM and SIB Bytes 

The opcode can be followed by a mode-register-memory (ModRM) byte, which further describes the 
operation and/or operands. The ModRM byte may also be followed by a scale-index-base (SIB) byte, 
which is used to specify indexed register-indirect forms of memory addressing. The ModRM and SIB 
bytes are described in “ModRM and SIB Bytes” on page 17. Their legacy functions can be augmented 
by the REX prefix (see “REX Prefix” on page 14) or the VEX and XOP escape sequences (see “VEX 
and XOP Prefixes” on page 16). 
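
As a rough illustration of how the two bytes are split into fields, the sketch below decodes a ModRM byte and a SIB byte. The bit positions used (a 2-bit field in bits 7:6 and two 3-bit fields in bits 5:3 and 2:0) are not stated in this overview; they are the conventional x86 layout and are treated here as an assumption, as are the names `decode_modrm`, `decode_sib`, and `has_sib`. The REX, VEX, and XOP extensions to these fields are ignored.

```python
from typing import NamedTuple


class ModRM(NamedTuple):
    mod: int  # bits 7:6, addressing mode
    reg: int  # bits 5:3, register or opcode extension
    rm: int   # bits 2:0, register or memory operand


class SIB(NamedTuple):
    scale: int  # multiplier 1, 2, 4, or 8, from bits 7:6
    index: int  # bits 5:3, index register number
    base: int   # bits 2:0, base register number


def decode_modrm(byte: int) -> ModRM:
    return ModRM(mod=(byte >> 6) & 0b11, reg=(byte >> 3) & 0b111, rm=byte & 0b111)


def decode_sib(byte: int) -> SIB:
    return SIB(scale=1 << ((byte >> 6) & 0b11),
               index=(byte >> 3) & 0b111,
               base=byte & 0b111)


def has_sib(modrm: ModRM) -> bool:
    """True when a SIB byte follows, assuming 32-bit or 64-bit addressing."""
    return modrm.mod != 0b11 and modrm.rm == 0b100
```

With these helpers, the byte 44h decodes to mod = 01b and r/m = 100b, so a SIB byte follows it, whereas C3h (mod = 11b) names a register operand and has no SIB byte.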

1.1.1.6 Displacement and Immediate Fields 

The instruction encoding may end with a 1-, 2-, or 4-byte displacement field and/or a 1-, 2-, or 4-byte 
immediate field depending on the instruction and/or the addressing mode. Specific instructions also 
allow either an 8-byte immediate field or an 8-byte displacement field. 

1.1.2 Representation in Memory 

Instructions are stored in memory in little-endian order. The first byte of an instruction is stored at the 
lowest memory address, as shown in Figure 1-2 below. Since instructions are strings of bytes, they 
may start at any memory address. The total instruction length must be less than or equal to 15 bytes. If this 
limit is exceeded, a general-protection exception results. 

[Figure 1-2 diagram: byte layout of an instruction in memory, listed below from the lowest address 
(first byte) to the highest address. The total length is at most 15 bytes.] 

Legacy encoding, including optional REX prefix: 

  Legacy Prefix     optional with most instructions; up to 5 bytes 
  REX               see note 1 
  Escape            optional, based on instruction; up to 2 bytes 
  Opcode 
  ModRM             optional, based on instruction 
  SIB               optional, based on addressing mode 
  Displacement      optional, based on addressing mode; 1, 2, 4, or 8 bytes (see note 4) 
  Immediate         optional, based on instruction; 1, 2, 4, or 8 bytes (see note 4) 

Extended encoding using VEX/XOP (see note 2): 

  Legacy Prefix     optional; see note 3 
  VEX/XOP 
  RXB.map_select    not present for VEX C5 
  W.vvvv.L.pp       R.vvvv.L.pp for VEX C5 
  Opcode 
  ModRM             optional, based on instruction 
  SIB               optional, based on addressing mode 
  Displacement      optional, based on addressing mode; 1, 2, 4, or 8 bytes (see note 4) 
  Immediate         optional, based on instruction; 1, 2, 4, or 8 bytes (see note 4) 

Notes: 

1. Available only in 64-bit mode. 

2. Available only in long or protected mode. 

3. F0, F2, F3, and 66 prefixes not allowed. 

4. Instructions that specify an 8-byte immediate field do not include a displacement field and vice versa. 

Figure 1-2. An Instruction as Stored in Memory 


1.2 Instruction Prefixes 

Instruction prefixes are of two types: instruction modifier prefixes and encoding escape prefixes. 
Instruction modifier prefixes can change the operation of the instruction (including causing its 
execution to repeat), change its operand types, specify an alternate operand size, augment register 
specification, or even change the interpretation of the opcode byte. 

The instruction modifier prefixes comprise the legacy prefixes and the REX prefix. The legacy 
prefixes are discussed in the next section. The REX prefix is discussed in “REX Prefix” on page 14. 

Encoding escape prefixes, on the other hand, signal that the two or three bytes that follow obey a 
different encoding syntax. As a group, the encoding escape prefix and its subsequent bytes constitute a 
multi-byte escape sequence. These multi-byte escape sequences perform functions similar to those of 
the instruction modifier prefixes, but they also provide a means to directly specify alternate opcode 
maps. 

The currently defined encoding escape prefixes are the VEX and XOP prefixes. They are discussed 
further in the section entitled “VEX and XOP Prefixes” on page 16. 

1.2.1 Summary of Legacy Prefixes 

Table 1-1 on page 7 shows the legacy prefixes. The legacy prefixes are organized into five groups, as 
shown in the left-most column of Table 1-1. An instruction encoding may include a maximum of one 
prefix from each of the five groups. The legacy prefixes can appear in any order within the position 
shown in Figure 1-1 for legacy prefixes. The result of using multiple prefixes from a single group is 
undefined. 

Some of the restrictions on legacy prefixes are: 

• Operand-Size Override —This prefix only affects the operand size for general-purpose instructions 
or for other instructions whose source or destination is a general-purpose register. When used in the 
encoding of SIMD and some other instructions, this prefix is repurposed to modify the opcode. 

• Address-Size Override —This prefix only affects the address size of memory operands. 

• Segment Override —In 64-bit mode, the CS, DS, ES, and SS segment override prefixes are 
ignored. 

• LOCK Prefix —This prefix is allowed only with certain instructions that modify memory. 

• Repeat Prefixes —These prefixes affect only certain string instructions. When used in the encoding 
of SIMD and some other instructions, these prefixes are repurposed to modify the opcode. 
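
The one-prefix-per-group rule can be sketched as a scanner over the leading bytes of an encoding. This is an illustration only: the name `scan_legacy_prefixes` is hypothetical, the prefix byte values are the conventional x86 values listed in Table 1-1, and raising an error for a repeated group is a choice made for this sketch, since the manual only states that the result is undefined.

```python
# Prefix byte -> group name, following the five groups of Table 1-1.
LEGACY_PREFIX_GROUPS = {
    0x66: "operand-size override",
    0x67: "address-size override",
    0x2E: "segment override",  # CS
    0x3E: "segment override",  # DS
    0x26: "segment override",  # ES
    0x64: "segment override",  # FS
    0x65: "segment override",  # GS
    0x36: "segment override",  # SS
    0xF0: "lock",
    0xF3: "repeat",            # REP, REPE/REPZ
    0xF2: "repeat",            # REPNE/REPNZ
}


def scan_legacy_prefixes(code: bytes) -> tuple:
    """Return (prefix_bytes, offset_of_first_non_prefix_byte).

    Prefixes may appear in any order. A second prefix from a group that
    was already seen raises ValueError (behavior is undefined in hardware).
    """
    seen = {}
    offset = 0
    while offset < len(code) and code[offset] in LEGACY_PREFIX_GROUPS:
        group = LEGACY_PREFIX_GROUPS[code[offset]]
        if group in seen:
            raise ValueError(f"more than one prefix from group: {group}")
        seen[group] = code[offset]
        offset += 1
    return (list(code[:offset]), offset)
```

Because each of the five groups may contribute at most one byte, a successful scan never returns more than five prefix bytes.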

Table 1-1. Legacy Instruction Prefixes 

Prefix Group (note 1)    Mnemonic        Prefix Byte (Hex)  Description 

Operand-Size Override    none            66 (note 2)        Changes the default operand size of a memory or 
                                                            register operand, as shown in Table 1-2 on page 8. 

Address-Size Override    none            67 (note 3)        Changes the default address size of a memory operand, 
                                                            as shown in Table 1-3 on page 9. 

Segment Override         CS              2E (note 4)        Forces use of the current CS segment for memory 
                                                            operands. 
                         DS              3E (note 4)        Forces use of the current DS segment for memory 
                                                            operands. 
                         ES              26 (note 4)        Forces use of the current ES segment for memory 
                                                            operands. 
                         FS              64                 Forces use of the current FS segment for memory 
                                                            operands. 
                         GS              65                 Forces use of the current GS segment for memory 
                                                            operands. 
                         SS              36 (note 4)        Forces use of the current SS segment for memory 
                                                            operands. 

Lock                     LOCK            F0 (note 5)        Causes certain kinds of memory read-modify-write 
                                                            instructions to occur atomically. 

Repeat                   REP             F3 (note 6)        Repeats a string operation (INS, MOVS, OUTS, LODS, 
                                                            and STOS) until the rCX register equals 0. 
                         REPE or REPZ    F3 (note 6)        Repeats a compare-string or scan-string operation 
                                                            (CMPSx and SCASx) until the rCX register equals 0 or 
                                                            the zero flag (ZF) is cleared to 0. 
                         REPNE or REPNZ  F2 (note 6)        Repeats a compare-string or scan-string operation 
                                                            (CMPSx and SCASx) until the rCX register equals 0 or 
                                                            the zero flag (ZF) is set to 1. 

Notes: 

1. A single instruction should include a maximum of one prefix from each of the five groups. 

2. When used in the encoding of SIMD instructions, this prefix is repurposed to modify the opcode. The prefix is 
ignored by 64-bit media floating-point (3DNow!™) instructions. See “Instructions that Cannot Use the Operand-Size 
Prefix” on page 8. 

3. This prefix also changes the size of the RCX register when used as an implied count register. 

4. In 64-bit mode, the CS, DS, ES, and SS segment overrides are ignored. 

5. The LOCK prefix should not be used for instructions other than those listed in “Lock Prefix” on page 11. 

6. This prefix should be used only with compare-string and scan-string instructions. When used in the encoding of 
SIMD instructions, the prefix is repurposed to modify the opcode. 

1.2.2 Operand-Size Override Prefix 

The default operand size for an instruction is determined by a combination of its opcode, the D 
(default) bit in the current code-segment descriptor, and the current operating mode, as shown in 
Table 1-2. The operand-size override prefix (66h) selects the non-default operand size. The prefix can 
be used with any general-purpose instruction that accesses non-fixed-size operands in memory or 
general-purpose registers (GPRs), and it can also be used with the x87 FLDENV, FNSTENV, 
FNSAVE, and FRSTOR instructions. 

In 64-bit mode, the prefix allows mixing of 16-bit, 32-bit, and 64-bit data on an
instruction-by-instruction basis. In compatibility and legacy modes, the prefix allows mixing of
16-bit and 32-bit operands on an instruction-by-instruction basis.


Table 1-2. Operand-Size Overrides 


Operating Mode                   Default         Effective       Instruction Prefix (note 1)
                                 Operand Size    Operand Size    66h           REX.W (note 3)
                                 (Bits)          (Bits)

Long Mode, 64-Bit Mode           32 (note 2)     64              don’t care    yes
                                                 32              no            no
                                                 16              yes           no

Long Mode, Compatibility Mode    32              32              no            Not applicable
                                                 16              yes           Not applicable
                                 16              32              yes           Not applicable
                                                 16              no            Not applicable

Legacy Mode (Protected,          32              32              no            Not applicable
Virtual-8086, or Real Mode)                      16              yes           Not applicable
                                 16              32              yes           Not applicable
                                                 16              no            Not applicable

Notes:

1. A “no” indicates that the default operand size is used.

2. This is the typical default, although some instructions default to other operand
sizes. See Appendix B, “General-Purpose Instructions in 64-Bit Mode,” for details.

3. See “REX Prefix” on page 14.


In 64-bit mode, most instructions default to a 32-bit operand size. For these instructions, a REX prefix 
(page 14) can specify a 64-bit operand size, and a 66h prefix specifies a 16-bit operand size. The REX 
prefix takes precedence over the 66h prefix. However, if an instruction defaults to a 64-bit operand 
size, it does not need a REX prefix and it can only be overridden to a 16-bit operand size. It cannot be 
overridden to a 32-bit operand size, because there is no 32-bit operand-size override prefix in 64-bit 
mode. Two groups of instructions have a default 64-bit operand size in 64-bit mode: 

• Near branches. For details, see “Near Branches in 64-Bit Mode” in Volume 1. 

• All instructions, except far branches, that implicitly reference the RSP. For details, see “Stack
Operation” in Volume 1.

Instructions that Cannot Use the Operand-Size Prefix. The operand-size prefix should be used 
only with general-purpose instructions and the x87 FLDENV, FNSTENV, FNSAVE, and FRSTOR
instructions, in which the prefix selects between 16-bit and 32-bit operand size. The prefix is ignored
by all other x87 instructions and by 64-bit media floating-point (3DNow!™) instructions. 

For other instructions (mostly SIMD instructions) the 66h, F2h, and F3h prefixes are used as
instruction modifiers to extend the instruction encoding space in the 0Fh, 0F_38h, and 0F_3Ah opcode
maps.

Operand-Size and REX Prefixes. The W bit field of the REX prefix takes precedence over the 66h 
prefix. See “REX.W: Operand width (Bit 3)” on page 23 for details. 
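
The operand-size rules of Table 1-2 can be sketched as a small lookup function. This is an illustrative model only: the function name, the mode labels, and the argument names are assumptions made for this sketch, and instructions with fixed operand sizes are not modeled.

```python
def effective_operand_size(mode, default_size, prefix_66=False, rex_w=False):
    """Return the effective operand size in bits, following Table 1-2.

    mode: "64-bit", "compatibility", or "legacy" (labels chosen for this sketch).
    default_size: the instruction's default operand size in bits.
    """
    if mode == "64-bit":
        if default_size not in (32, 64):
            raise ValueError("64-bit mode defaults are 32 or 64 bits")
        # REX.W takes precedence over 66h; a 64-bit default needs no REX prefix.
        if rex_w or (default_size == 64 and not prefix_66):
            return 64
        # Without REX.W, 66h selects 16 bits; otherwise the default applies.
        return 16 if prefix_66 else default_size
    if mode in ("compatibility", "legacy"):
        if default_size not in (16, 32):
            raise ValueError("default operand size must be 16 or 32 bits")
        if rex_w:
            raise ValueError("REX.W is only available in 64-bit mode")
        # 66h selects the non-default size: 16 <-> 32.
        if prefix_66:
            return 16 if default_size == 32 else 32
        return default_size
    raise ValueError("unknown operating mode")
```

For example, a 32-bit-default instruction in 64-bit mode that carries both 66h and REX.W = 1 resolves to 64 bits, while a 64-bit-default instruction with only 66h resolves to 16 bits.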

1.2.3 Address-Size Override Prefix 

The default address size for instructions that access non-stack memory is determined by the current 
operating mode, as shown in Table 1-3. The address-size override prefix (67h) selects the non-default 
address size. Depending on the operating mode, this prefix allows mixing of 16-bit and 32-bit, or of 
32-bit and 64-bit addresses, on an instruction-by-instruction basis. The prefix changes the address size 
for memory operands. It also changes the size of the RCX register for instructions that use RCX 
implicitly. 

For instructions that implicitly access the stack segment (SS), the address size for stack accesses is 
determined by the D (default) bit in the stack-segment descriptor. In 64-bit mode, the D bit is ignored, 
and all stack references have a 64-bit address size. However, if an instruction accesses both stack and 
non-stack memory, the address size of the non-stack access is determined as shown in Table 1-3.


Table 1-3. Address-Size Overrides 


Operating Mode                   Default         Effective       Address-Size Prefix
                                 Address Size    Address Size    (67h) Required? (note 1)
                                 (Bits)          (Bits)

Long Mode, 64-Bit Mode           64              64              no
                                                 32              yes

Long Mode, Compatibility Mode    32              32              no
                                                 16              yes
                                 16              32              yes
                                                 16              no

Legacy Mode (Protected,          32              32              no
Virtual-8086, or Real Mode)                      16              yes
                                 16              32              yes
                                                 16              no

Notes:

1. A “no” indicates that the default address size is used.


As Table 1-3 shows, the default address size is 64 bits in 64-bit mode. The size can be overridden to 32 
bits, but 16-bit addresses are not supported in 64-bit mode. In compatibility and legacy modes, the
default address size is 16 bits or 32 bits, depending on the operating mode (see “Processor
Initialization and Long Mode Activation” in Volume 2 for details). In these modes, the address-size 
prefix selects the non-default size, but the 64-bit address size is not available. 
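
A minimal sketch of the address-size selection in Table 1-3 follows. The function name, mode labels, and the COUNT_REGISTER mapping are assumptions introduced for illustration; the mapping reflects the statement above that the prefix also changes the size of the implied count register.

```python
# Implied count register for each effective address size (see rCX above).
COUNT_REGISTER = {16: "CX", 32: "ECX", 64: "RCX"}


def effective_address_size(mode, default_size, prefix_67=False):
    """Return the effective address size in bits, following Table 1-3."""
    if mode == "64-bit":
        if default_size != 64:
            raise ValueError("the default address size in 64-bit mode is 64")
        # 67h selects 32-bit addressing; 16-bit addressing is not available.
        return 32 if prefix_67 else 64
    if mode in ("compatibility", "legacy"):
        if default_size not in (16, 32):
            raise ValueError("default address size must be 16 or 32 bits")
        # 67h selects the non-default size; 64-bit addressing is not available.
        if prefix_67:
            return 16 if default_size == 32 else 32
        return default_size
    raise ValueError("unknown operating mode")
```

Under these assumptions, `COUNT_REGISTER[effective_address_size("64-bit", 64, prefix_67=True)]` yields ECX, the count register a repeated string instruction would use with a 67h prefix in 64-bit mode.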

Certain instructions reference pointer registers or count registers implicitly, rather than explicitly. In 
such instructions, the address-size prefix affects the size of such addressing and count registers, just as 
it does when such registers are explicitly referenced. Table 1-4 lists all such instructions and the 
registers referenced using the three possible address sizes. 

Table 1-4. Pointer and Count Registers and the Address-Size Prefix 


Instruction                                           Pointer or Count Register
                                                      16-Bit          32-Bit           64-Bit
                                                      Address Size    Address Size     Address Size

CMPS, CMPSB, CMPSW, CMPSD, CMPSQ (Compare Strings)    SI, DI, CX      ESI, EDI, ECX    RSI, RDI, RCX
INS, INSB, INSW, INSD (Input String)                  DI, CX          EDI, ECX         RDI, RCX
JCXZ, JECXZ, JRCXZ (Jump on CX/ECX/RCX Zero)          CX              ECX              RCX
LODS, LODSB, LODSW, LODSD, LODSQ (Load String)        SI, CX          ESI, ECX         RSI, RCX
LOOP, LOOPE, LOOPNZ, LOOPNE, LOOPZ (Loop)             CX              ECX              RCX
MOVS, MOVSB, MOVSW, MOVSD, MOVSQ (Move String)        SI, DI, CX      ESI, EDI, ECX    RSI, RDI, RCX
OUTS, OUTSB, OUTSW, OUTSD (Output String)             SI, CX          ESI, ECX         RSI, RCX
REP, REPE, REPNE, REPNZ, REPZ (Repeat Prefixes)       CX              ECX              RCX
SCAS, SCASB, SCASW, SCASD, SCASQ (Scan String)        DI, CX          EDI, ECX         RDI, RCX
STOS, STOSB, STOSW, STOSD, STOSQ (Store String)       DI, CX          EDI, ECX         RDI, RCX
XLAT, XLATB (Table Look-up Translation)               BX              EBX              RBX


1.2.4 Segment-Override Prefixes 

Segment overrides can be used only with instructions that reference non-stack memory. Most 
instructions that reference memory are encoded with a ModRM byte (page 17). The default segment
for such memory-referencing instructions is implied by the base register indicated in its ModRM byte,
as follows: 

• Instructions that Reference a Non-Stack Segment —If an instruction encoding references any base 
register other than rBP or rSP, or if an instruction contains an immediate offset, the default segment 
is the data segment (DS). These instructions can use the segment-override prefix to select one of 
the non-default segments, as shown in Table 1-5. 

• String Instructions —String instructions reference two memory operands. By default, they 
reference both the DS and ES segments (DS:rSI and ES:rDI). These instructions can override their 
DS-segment reference, as shown in Table 1-5, but they cannot override their ES-segment 
reference. 

• Instructions that Reference the Stack Segment —If an instruction’s encoding references the rBP or 
rSP base register, the default segment is the stack segment (SS). All instructions that reference the 
stack (push, pop, call, interrupt, return from interrupt) use SS by default. These instructions cannot 
use the segment-override prefix. 


Table 1-5. Segment-Override Prefixes 


Mnemonic       Prefix Byte    Description
               (Hex)

CS (note 1)    2E             Forces use of current CS segment for memory operands.
DS (note 1)    3E             Forces use of current DS segment for memory operands.
ES (note 1)    26             Forces use of current ES segment for memory operands.
FS             64             Forces use of current FS segment for memory operands.
GS             65             Forces use of current GS segment for memory operands.
SS (note 1)    36             Forces use of current SS segment for memory operands.

Notes:

1. In 64-bit mode, the CS, DS, ES, and SS segment overrides are ignored.


Segment Overrides in 64-Bit Mode. In 64-bit mode, the CS, DS, ES, and SS segment-override 
prefixes have no effect. These four prefixes are not treated as segment-override prefixes for the 
purposes of multiple-prefix rules. Instead, they are treated as null prefixes. 

The FS and GS segment-override prefixes are treated as true segment-override prefixes in 64-bit 
mode. Use of the FS or GS prefix causes their respective segment bases to be added to the effective 
address calculation. See “FS and GS Registers in 64-Bit Mode” in Volume 2 for details. 
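
The prefix bytes in Table 1-5 and the 64-bit mode behavior described above can be sketched as follows. The function and table names are hypothetical; the sketch only reports which segment, if any, a prefix byte selects, and does not model the multiple-prefix rules.

```python
# Segment-override prefix bytes from Table 1-5.
SEGMENT_PREFIXES = {
    0x2E: "CS", 0x3E: "DS", 0x26: "ES",
    0x64: "FS", 0x65: "GS", 0x36: "SS",
}


def segment_override(prefix_byte, in_64bit_mode):
    """Return the segment selected by a prefix byte, or None if it selects none."""
    segment = SEGMENT_PREFIXES.get(prefix_byte)
    if segment is None:
        return None  # not a segment-override prefix at all
    if in_64bit_mode and segment not in ("FS", "GS"):
        return None  # CS, DS, ES, and SS overrides act as null prefixes
    return segment
```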

1.2.5 Lock Prefix 

The LOCK prefix causes certain kinds of memory read-modify-write instructions to occur atomically. 
The mechanism for doing so is implementation-dependent (for example, the mechanism may involve 
bus signaling or packet messaging between the processor and a memory controller). The prefix is 
intended to give the processor exclusive use of shared memory in a multiprocessor system.

The LOCK prefix can only be used with forms of the following instructions that write a memory
operand: ADC, ADD, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, CMPXCHG16B, DEC, 
INC, NEG, NOT, OR, SBB, SUB, XADD, XCHG, and XOR. An invalid-opcode exception occurs if 
the LOCK prefix is used with any other instruction. 
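
As an illustration, an assembler or decoder might check LOCK usage against the list above before emitting or accepting an encoding. The helper below is a sketch under that assumption; its name and the boolean argument are not part of the architecture definition.

```python
# Instructions that accept the LOCK prefix, taken from the list above.
LOCKABLE = frozenset(
    "ADC ADD AND BTC BTR BTS CMPXCHG CMPXCHG8B CMPXCHG16B DEC "
    "INC NEG NOT OR SBB SUB XADD XCHG XOR".split()
)


def lock_prefix_allowed(mnemonic, destination_is_memory):
    """Return True if LOCK may be used with this instruction form.

    LOCK is valid only for the listed mnemonics, and only for forms
    that write a memory operand.
    """
    return mnemonic.upper() in LOCKABLE and destination_is_memory
```

A form for which this returns False would raise an invalid-opcode exception if executed with a LOCK prefix, as described above.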

1.2.6 Repeat Prefixes 

The repeat prefixes cause repetition of certain instructions that load, store, move, input, or output 
strings. The prefixes should only be used with such string instructions. Two pairs of repeat prefixes, 
REPE/REPZ and REPNE/REPNZ, perform the same repeat functions for certain compare-string and 
scan-string instructions. The repeat function uses rCX as a count register. The size of rCX is based on 
address size, as shown in Table 1-4 on page 10. 

REP. The REP prefix repeats its associated string instruction the number of times specified in the 
counter register (rCX). It terminates the repetition when the value in rCX reaches 0. The prefix can be 
used with the INS, LODS, MOVS, OUTS, and STOS instructions. Table 1-6 shows the valid REP 
prefix opcodes. 


Table 1-6. REP Prefix Opcodes 


Mnemonic                                 Opcode

REP INS reg/mem8, DX                     F3 6C
REP INSB

REP INS reg/mem16/32, DX                 F3 6D
REP INSW
REP INSD

REP LODS mem8                            F3 AC
REP LODSB

REP LODS mem16/32/64                     F3 AD
REP LODSW
REP LODSD
REP LODSQ

REP MOVS mem8, mem8                      F3 A4
REP MOVSB

REP MOVS mem16/32/64, mem16/32/64        F3 A5
REP MOVSW
REP MOVSD
REP MOVSQ

REP OUTS DX, reg/mem8                    F3 6E
REP OUTSB

REP OUTS DX, reg/mem16/32                F3 6F
REP OUTSW
REP OUTSD

REP STOS mem8                            F3 AA
REP STOSB

REP STOS mem16/32/64                     F3 AB
REP STOSW
REP STOSD
REP STOSQ


REPE and REPZ. REPE and REPZ are synonyms and have identical opcodes. These prefixes repeat 
their associated string instruction the number of times specified in the counter register (rCX). The 
repetition terminates when the value in rCX reaches 0 or when the zero flag (ZF) is cleared to 0. The 
REPE and REPZ prefixes can be used with the CMPS, CMPSB, CMPSD, CMPSW, SCAS, SCASB, 
SCASD, and SCASW instructions. Table 1-7 shows the valid REPE and REPZ prefix opcodes. 


Table 1-7. REPE and REPZ Prefix Opcodes 


Mnemonic                                 Opcode

REPx CMPS mem8, mem8                     F3 A6
REPx CMPSB

REPx CMPS mem16/32/64, mem16/32/64       F3 A7
REPx CMPSW
REPx CMPSD
REPx CMPSQ

REPx SCAS mem8                           F3 AE
REPx SCASB

REPx SCAS mem16/32/64                    F3 AF
REPx SCASW
REPx SCASD
REPx SCASQ


REPNE and REPNZ. REPNE and REPNZ are synonyms and have identical opcodes. These prefixes 
repeat their associated string instruction the number of times specified in the counter register (rCX). 
The repetition terminates when the value in rCX reaches 0 or when the zero flag (ZF) is set to 1. The 
REPNE and REPNZ prefixes can be used with the CMPS, CMPSB, CMPSD, CMPSW, SCAS, 
SCASB, SCASD, and SCASW instructions. Table 1-8 on page 14 shows the valid REPNE and 
REPNZ prefix opcodes.


Table 1-8. REPNE and REPNZ Prefix Opcodes

Mnemonic                                 Opcode

REPNx CMPS mem8, mem8                    F2 A6
REPNx CMPSB

REPNx CMPS mem16/32/64, mem16/32/64      F2 A7
REPNx CMPSW
REPNx CMPSD
REPNx CMPSQ

REPNx SCAS mem8                          F2 AE
REPNx SCASB

REPNx SCAS mem16/32/64                   F2 AF
REPNx SCASW
REPNx SCASD
REPNx SCASQ
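
The termination conditions of the conditional repeat prefixes can be illustrated with a simplified software model of a repeated byte scan. This is a sketch, not a description of any hardware implementation: it assumes DF = 0, ignores segmentation and faults, and uses hypothetical names throughout.

```python
def rep_scas_byte(memory, rdi, rcx, al, zf, stop_when_zf):
    """Model REPE SCASB (stop_when_zf=False) or REPNE SCASB (stop_when_zf=True).

    memory: a bytes-like object standing in for the ES:rDI string.
    zf: the zero flag on entry; it is left unchanged if rCX is already 0.
    Returns the final (rdi, rcx, zf).
    """
    while rcx != 0:
        zf = memory[rdi] == al   # SCASB compares AL with the byte at rDI
        rdi += 1                 # DF = 0, so the pointer moves up
        rcx -= 1                 # each iteration decrements the count
        if zf == stop_when_zf:   # REPE stops on ZF = 0, REPNE on ZF = 1
            break
    return rdi, rcx, zf
```

For example, scanning the 11-byte string `b"hello\x00world"` for a zero byte with the REPNE behavior stops after six iterations, leaving rDI one past the terminator, rCX = 5, and ZF = 1.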


Instructions that Cannot Use Repeat Prefixes. In general, the repeat prefixes should only be used
in the string instructions listed in Tables 1-6, 1-7, and 1-8 above. For other instructions (mostly SIMD
instructions) the 66h, F2h, and F3h prefixes are used as instruction modifiers to extend the instruction
encoding space in the 0Fh, 0F_38h, and 0F_3Ah opcode maps.

Optimization of Repeats. Depending on the hardware implementation, the repeat prefixes can have 
a setup overhead. If the repeated count is variable, the overhead can sometimes be avoided by 
substituting a simple loop to move or store the data. Repeated string instructions can be expanded into 
equivalent sequences of inline loads and stores or a sequence of stores can be used to emulate a REP 
STOS. 

For repeated string moves, performance can be maximized by moving the largest possible operand 
size. For example, use REP MOVSD rather than REP MOVSW and REP MOVSW rather than REP 
MOVSB. Use REP STOSD rather than REP STOSW and REP STOSW rather than REP STOSB.

Depending on the hardware implementation, string moves with the direction flag (DF) cleared to 0 
(up) may be faster than string moves with DF set to 1 (down). DF = 1 is only needed for certain cases 
of overlapping REP MOVS, such as when the source and the destination overlap.

1.2.7 REX Prefix 

The REX prefix, available in 64-bit mode, enables use of the AMD64 register and operand size 
extensions. Unlike the legacy instruction modification prefixes, REX is not a single unique value, but 
occupies a range (40h to 4Fh). Figure 1-1 on page 2 shows how the REX prefix fits within the 
encoding syntax of instructions. 

The REX prefix enables the following features in 64-bit mode: 

• Use of the extended GPR (Figure 2-3 on page 39) and YMM/XMM registers (Figure 2-8 on 
page 44).

• Use of the 64-bit operand size when accessing GPRs.

• Use of the extended control and debug registers, as described in Section 2.4 “Registers” in 
Volume 2. 

• Use of the uniform byte registers (AL-R15).

REX contains five fields. The upper nibble is unique to the REX prefix and identifies it as such. The
lower nibble is divided into four 1-bit fields (W, R, X, and B). See below for a discussion of these
fields. Figure 1-3 below shows the format of the REX prefix. Since each bit of the lower nibble can be
a 1 or a 0, REX spans one full row of the primary opcode map occupying entries 40h through 4Fh.


Bits:     7   6   5   4     3     2     1     0
Value:    0   1   0   0     W     R     X     B

Figure 1-3. REX Prefix Format 


A REX prefix is normally required with an instruction that accesses a 64-bit GPR or one of the 
extended GPR or YMM/XMM registers. A few instructions have an operand size that defaults to (or is 
fixed at) 64 bits in 64-bit mode, and thus do not need a REX prefix. These instructions are listed in 
Table 1-9 below. 


Table 1-9. Instructions Not Requiring REX Prefix in 64-Bit Mode 


CALL (Near)        POP reg/mem
ENTER              POP reg
Jcc                POP FS
JrCXZ              POP GS
JMP (Near)         POPF, POPFD, POPFQ
LEAVE              PUSH imm8
LGDT               PUSH imm32
LIDT               PUSH reg/mem
LLDT               PUSH reg
LOOP               PUSH FS
LOOPcc             PUSH GS
LTR                PUSHF, PUSHFD, PUSHFQ
MOV CRn            RET (Near)
MOV DRn


An instruction may have only one REX prefix which must immediately precede the opcode or first 
escape byte in the instruction encoding. The use of a REX prefix in an instruction that does not access 
an extended register is ignored. The instruction-size limit of 15 bytes applies to instructions that 
contain a REX prefix.

Implications for INC and DEC Instructions

The REX prefix values are taken from the 16 single-byte INC and DEC instructions, one for each of 
the eight legacy GPRs. Therefore, these single-byte opcodes for INC and DEC are not available in 64- 
bit mode, although they are available in legacy and compatibility modes. The functionality of these 
INC and DEC instructions is still available in 64-bit mode, however, using the ModRM forms of those 
instructions (opcodes FF /0 and FF /1).
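
The dual meaning of byte values 40h through 4Fh can be sketched as a mode-dependent classifier. The function name and the dictionary layout are assumptions for illustration. In the legacy single-byte forms, values 40h-47h encode INC and 48h-4Fh encode DEC, with the register number in the low three bits.

```python
def interpret_40_4f(byte, in_64bit_mode):
    """Classify a byte in the range 40h-4Fh according to the operating mode."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("byte is outside the range 40h-4Fh")
    if in_64bit_mode:
        # In 64-bit mode the byte is a REX prefix; the low nibble holds W, R, X, B.
        return {
            "kind": "REX",
            "W": (byte >> 3) & 1,
            "R": (byte >> 2) & 1,
            "X": (byte >> 1) & 1,
            "B": byte & 1,
        }
    # In legacy and compatibility modes the byte is a single-byte INC or DEC.
    return {"kind": "INC" if byte < 0x48 else "DEC", "reg": byte & 0b111}
```

Under this model, the byte 48h is a REX prefix with only W set in 64-bit mode, but a DEC of register 0 (rAX) in legacy and compatibility modes.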

1.2.8 VEX and XOP Prefixes 

The extended instruction encoding syntax, available in protected and long modes, provides one 2-byte 
and three 3-byte escape sequences introduced by either the VEX or XOP prefixes. These multi-byte 
sequences not only select opcode maps, they also provide instruction modifiers similar to, but in lieu 
of, the REX prefix. 

The 2-byte escape sequence initiated by the VEX C5h prefix implies a map_select encoding of 1. The
three-byte escape sequences, initiated by the VEX C4h prefix or the XOP (8Fh) prefix, select the target
opcode map explicitly via the VEX/XOP.map_select field. The five-bit VEX.map_select field allows
the selection of one of 31 different opcode maps (opcode map 00h is reserved). The XOP.map_select
field is restricted to the range 08h-1Fh and thus can only select one of 24 different opcode maps.

The VEX and XOP escape sequences contain fields that extend register addressing to a total of 16, 
increase the operand specification capability to four operands, and modify the instruction operation. 

The extended SSE instruction subsets AVX, AES, CLMUL, FMA, FMA4, and XOP and a few non-SSE
instructions utilize the extended encoding syntax. See “Encoding Using the VEX and XOP Prefixes” 
on page 29 for details on the encoding of the two- and three-byte extended escape sequences. 

1.3 Opcode 

The opcode is a single byte that specifies the basic operation of an instruction. In some cases, it also 
specifies the operands for the instruction. Every instruction requires an opcode. The correspondence 
between the binary value of the opcode and the operation it represents is defined by a table called an 
opcode map. As discussed in the previous sections, the legacy prefixes 66h, F2h, and F3h and other 
fields within the instruction encoding may be used to modify the operation encoded by the opcode. 

The effect of the presence of a 66h, F2h, or F3h prefix on the operation performed by the opcode is
represented in the opcode map by additional rows in the table indexed by the applicable prefix. The
3-bit reg and r/m fields of the ModRM byte (“ModRM and SIB Bytes” on page 17) are used as well in
the encoding of certain instructions. This is represented in the opcode maps via instruction group
tables that detail the modifications represented via the extra encoding bits. See Section A.1, “Opcode
Maps” of Appendix A for examples.

Even though each instruction has a unique opcode map and opcode, assemblers often support multiple 
alternate mnemonics for the same instruction to improve the readability of assembly language code.

The 64-bit floating-point 3DNow! instructions utilize the two-byte escape sequence 0Fh, 0Fh to select
the 3DNow! opcode map. For these instructions the opcode is encoded in the immediate field at the
end of the instruction encoding.

For details on how the opcode byte encodes the basic operation for specific instructions, see Section
A.1, “Opcode Maps” of Appendix A.

1.4 ModRM and SIB Bytes 

The ModRM byte is optional depending on the instruction. When present, it follows the opcode and is 
used to specify: 

• two register-based operands, or 

• one register-based operand and a second memory-based operand and an addressing mode. 

In the encoding of some instructions, fields within the ModRM byte are repurposed to provide 
additional opcode bits used to define the instruction’s function. 

The ModRM byte is partitioned into three fields— mod, reg, and r/m. Normally the reg field specifies a 
register-based operand and the mod and r/m fields used together specify a second operand that is either 
register-based or memory-based. The addressing mode is also specified when the operand is memory- 
based. 

In 64-bit mode, the REX.R and REX.B bits augment the reg and r/m fields respectively allowing the 
specification of twice the number of registers. 

1.4.1 ModRM Byte Format 

Figure 1-4 below shows the format of a ModRM byte. 


Bits:      7  6      5  4  3      2  1  0
Field:     mod       reg          r/m

REX.R, VEX.R, or XOP.R extend the reg field to 4 bits.
REX.B, VEX.B, or XOP.B extend the r/m field to 4 bits.


Figure 1-4. ModRM-Byte Format 


Depending on the addressing mode, the SIB byte may appear after the ModRM byte. SIB is used in the 
specification of various forms of indexed register-indirect addressing. See the following section for 
details.

ModRM.mod (Bits[7:6]). The mod field is used with the r/m field to specify the addressing mode for
an operand. ModRM.mod = 11b specifies the register-direct addressing mode. In the register-direct
mode, the operand is held in the specified register. ModRM.mod values less than 11b specify
register-indirect addressing modes. In register-indirect addressing modes, values held in registers
along with an optional displacement specified in the instruction encoding are used to calculate the
address of a memory-based operand. Other encodings of the 5 bits {mod, r/m} are discussed below.

ModRM.reg (Bits[5:3]). The reg field is used to specify a register-based operand, although for some 
instructions, this field is used to extend the operation encoding. The encodings for this field are shown 
in Table 1-10 below. 

ModRM.r/m (Bits[2:0]). As stated above, the r/m field is used in combination with the mod field to 
encode 32 different operand specifications (See Table 1-14 on page 21). The encodings for this field 
are shown in Table 1-10 below. 


Table 1-10. ModRM.reg and .r/m Field Encodings 


Encoded value    ModRM.reg (note 1)            ModRM.r/m (mod = 11b) (note 1)    ModRM.r/m (mod ≠ 11b) (note 2)
(binary)

000              rAX, MMX0, XMM0, YMM0         rAX, MMX0, XMM0, YMM0             [rAX]
001              rCX, MMX1, XMM1, YMM1         rCX, MMX1, XMM1, YMM1             [rCX]
010              rDX, MMX2, XMM2, YMM2         rDX, MMX2, XMM2, YMM2             [rDX]
011              rBX, MMX3, XMM3, YMM3         rBX, MMX3, XMM3, YMM3             [rBX]
100              AH, rSP, MMX4, XMM4, YMM4     AH, rSP, MMX4, XMM4, YMM4         SIB (note 3)
101              CH, rBP, MMX5, XMM5, YMM5     CH, rBP, MMX5, XMM5, YMM5         [rBP] (note 4)
110              DH, rSI, MMX6, XMM6, YMM6     DH, rSI, MMX6, XMM6, YMM6         [rSI]
111              BH, rDI, MMX7, XMM7, YMM7     BH, rDI, MMX7, XMM7, YMM7         [rDI]

Notes:

1. Specific register used is instruction-dependent.

2. mod = 01 and mod = 10 include an offset specified by the instruction displacement field.
The notation [*] signifies that the specified register holds the address of the operand.

3. Indexed register-indirect addressing. SIB byte follows ModRM byte. See following section for SIB encoding.

4. For mod = 00b, r/m = 101b signifies absolute (displacement-only) addressing in 32-bit mode or RIP-relative
addressing in 64-bit mode, where the rBP register is not used. For mod = [01b, 10b], r/m = 101b specifies
the base + offset addressing mode with [rBP] as the base.


Similar to the reg field, r/m is used in some instructions to extend the operation encoding. 
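
The field layout of the ModRM byte can be sketched with two small helpers. The function names and the returned labels are assumptions for illustration; the classification follows the mod and r/m encodings summarized in Table 1-10 and its notes.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("ModRM must be a single byte")
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def addressing_form(mod, rm):
    """Name the operand form selected by the mod and r/m fields."""
    if mod == 0b11:
        return "register-direct"
    if rm == 0b100:
        return "SIB byte follows"
    if mod == 0b00 and rm == 0b101:
        return "disp32 (RIP-relative in 64-bit mode)"
    return "register-indirect"
```

For instance, the byte C3h splits into mod = 11b, reg = 000b, r/m = 011b, which is a register-direct form, while 44h (mod = 01b, r/m = 100b) indicates that a SIB byte follows.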

1.4.2 SIB Byte Format 

The SIB byte has three fields— scale, index, and base —that define the scale factor, index-register 
number, and base-register number for the 32-bit and 64-bit indexed register-indirect addressing 
modes.

The basic formula for computing the effective address of a memory-based operand using the indexed
register-indirect address modes is:

effective_address = scale * index + base + offset 

Specific variants of this addressing mode set one or more elements of the sum to zero. 

Figure 1-5 below shows the format of the SIB byte. 


Bits:      7  6      5  4  3      2  1  0
Field:     scale     index        base

The REX.X bit of the REX prefix can extend the index field to 4 bits.
The REX.B bit of the REX prefix can extend the base field to 4 bits.


Figure 1-5. SIB Byte Format 


SIB.scale (Bits[7:6]). The scale field is used to specify the scale factor used in computing the 
scale*index portion of the effective address. In normal usage scale represents the size of data elements 
in an array expressed in number of bytes. SIB.scale is encoded as shown in Table 1-11 below. 


Table 1-11. SIB.scale Field Encodings 


Encoded value 
(binary) 

scale 

factor 

00 

1 

01 

2 

10 

4 

11 

8 


SIB.index (Bits[5:3]). The index field is used to specify the register containing the index portion of 
the indexed register-indirect effective address. SIB.index is encoded as shown in Table 1-12 below. 

SIB.base (Bits[2:0]). The base field is used to specify the register containing the base address 
portion of the indexed register-indirect effective address. SIB.base is encoded as shown in Table 1-12 
below.


Table 1-12. SIB.index and .base Field Encodings

Encoded value    SIB.index          SIB.base
(binary)

000              [rAX]              [rAX]
001              [rCX]              [rCX]
010              [rDX]              [rDX]
011              [rBX]              [rBX]
100              (none) (note 1)    [rSP]
101              [rBP]              [rBP], (none) (note 2)
110              [rSI]              [rSI]
111              [rDI]              [rDI]

Notes:

1. Register specification is null. The scale*index portion of the indexed register-indirect
effective address is set to 0.

2. If ModRM.mod = 00b, the register specification is null. The base portion of the indexed
register-indirect effective address is set to 0. Otherwise, base encodes the rBP register as
the source of the base address used in the effective address calculation.


Table 1-13. SIB.base encodings for ModRM.r/m = 100b 



mod    SIB base Field
       000      001      010      011      100      101             110      111

00     [rAX]    [rCX]    [rDX]    [rBX]    [rSP]    disp32          [rSI]    [rDI]
01     [rAX]    [rCX]    [rDX]    [rBX]    [rSP]    [rBP]+disp8     [rSI]    [rDI]
10     [rAX]    [rCX]    [rDX]    [rBX]    [rSP]    [rBP]+disp32    [rSI]    [rDI]
11     (not applicable)


More discussion of operand addressing follows in the next two sections. 

1.4.3 Operand Addressing in Legacy 32-bit and Compatibility Modes 

The mod and r/m fields of the ModRM byte provide a total of five bits used to encode 32 operand 
specification and memory addressing modes. Table 1-14 below shows these encodings. 


AMD64 Technology, 24594, Rev. 3.28, September 2019

Table 1-14. Operand Addressing Using ModRM and SIB Bytes 


ModRM.mod    ModRM.r/m    Register / Effective Address

00           000          [rAX]
             001          [rCX]
             010          [rDX]
             011          [rBX]
             100          SIB (note 1)
             101          disp32
             110          [rSI]
             111          [rDI]

01           000          [rAX]+disp8
             001          [rCX]+disp8
             010          [rDX]+disp8
             011          [rBX]+disp8
             100          SIB+disp8 (note 2)
             101          [rBP]+disp8
             110          [rSI]+disp8
             111          [rDI]+disp8

10           000          [rAX]+disp32
             001          [rCX]+disp32
             010          [rDX]+disp32
             011          [rBX]+disp32
             100          SIB+disp32 (note 3)
             101          [rBP]+disp32
             110          [rSI]+disp32
             111          [rDI]+disp32

11           000          AL/rAX/MMX0/XMM0/YMM0
             001          CL/rCX/MMX1/XMM1/YMM1
             010          DL/rDX/MMX2/XMM2/YMM2
             011          BL/rBX/MMX3/XMM3/YMM3
             100          AH/SPL/rSP/MMX4/XMM4/YMM4
             101          CH/BPL/rBP/MMX5/XMM5/YMM5
             110          DH/SIL/rSI/MMX6/XMM6/YMM6
             111          BH/DIL/rDI/MMX7/XMM7/YMM7

Notes:

0. In the following notes, scaled_index = SIB.index * (1 << SIB.scale).

1. SIB byte follows ModRM byte. Effective address is calculated using
scaled_index+base. When SIB.base = 101b, addressing mode depends on
ModRM.mod. See Table 1-13 above.

2. SIB byte follows ModRM byte. Effective address is calculated using
scaled_index+base+8-bit_offset. One-byte Displacement field provides the offset.

3. SIB byte follows ModRM byte. Effective address is calculated using
scaled_index+base+32-bit_offset. Four-byte Displacement field provides the offset.

Note that the addressing mode mod = 11b is a register-direct mode, that is, the operand is contained in
the specified register, while the modes mod = [00b:10b] specify different addressing modes for a
memory-based operand.

For mod = 11b, the register containing the operand is specified by the r/m field. For the other modes
(mod = [00b:10b]), the mod and r/m fields are combined to specify the addressing mode for the
memory-based operand. Most are register-indirect addressing modes, meaning that the address of the
memory-based operand is contained in the register specified by r/m. For these register-indirect modes,
mod = 01b and mod = 10b include an offset encoded in the displacement field of the instruction.

The encodings {mod ≠ 11b, r/m = 100b} specify the indexed register-indirect addressing mode in
which the target address is computed using a combination of values stored in registers and a scale
factor encoded directly in the SIB byte. For these addressing modes the effective address is given by
the formula:

effective_address = scale * index + base + offset 

Scale is encoded in SIB.scale field. Index is contained in the register specified by SIB.index field and 
base is contained in the register specified by SIB.base field. Offset is encoded in the displacement field 
of the instruction using either one or four bytes. 

If {mod, r/m} = 00100b, the offset portion of the formula is set to 0. For {mod, r/m} = 01100b and
{mod, r/m} = 10100b, offset is encoded in the 1-byte or 4-byte displacement field of the instruction,
respectively.
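
The memory-operand rules of Table 1-14, together with the SIB special cases in Tables 1-12 and 1-13, can be sketched as a single function for 32-bit addressing outside 64-bit mode. This is an illustrative model: the function name, the `regs` list (eight register values indexed by their 3-bit encoding), and the assumption that `disp` has already been decoded as a signed integer are all choices made for this sketch.

```python
def effective_address(mod, rm, regs, sib=None, disp=0):
    """Compute a 32-bit effective address for a memory operand (mod != 11b)."""
    if mod == 0b11:
        raise ValueError("mod = 11b is register-direct; there is no address")
    if rm == 0b100:
        if sib is None:
            raise ValueError("r/m = 100b requires a SIB byte")
        scale = 1 << (sib >> 6)            # SIB.scale encodes 1, 2, 4, or 8
        index = (sib >> 3) & 0b111
        base = sib & 0b111
        # index = 100b means no index register (Table 1-12, note 1).
        scaled_index = 0 if index == 0b100 else regs[index] * scale
        # base = 101b with mod = 00b means no base register; disp32 is used.
        no_base = base == 0b101 and mod == 0b00
        base_value = 0 if no_base else regs[base]
        offset = disp if (mod != 0b00 or no_base) else 0
        address = scaled_index + base_value + offset
    elif mod == 0b00 and rm == 0b101:
        address = disp                     # absolute (displacement-only)
    else:
        address = regs[rm] + (disp if mod != 0b00 else 0)
    return address & 0xFFFFFFFF            # wrap to the 32-bit address size
```

With rCX = 20h and rBX = 2000h, for example, mod = 00b, r/m = 100b, and SIB = 8Bh (scale = 4, index = rCX, base = rBX) gives 20h * 4 + 2000h = 2080h.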

Finally, the encoding {mod, r/m} = 00101b specifies an absolute addressing mode. In this mode, the 
address is provided directly in the instruction encoding using a 4-byte displacement field. In 64-bit 
mode this addressing mode is changed to RIP-relative (see “RIP-Relative Addressing” on page 24).

1.4.4 Operand Addressing in 64-bit Mode

The AMD64 architecture doubles the number of GPRs and increases their width to 64 bits. It also doubles
the number of YMM/XMM registers. In order to support the specification of register operands 
contained in the eight additional GPRs or YMM/XMM registers and to make the additional GPRs 
available to hold addresses to be used in the addressing modes, the REX prefix provides the R, X, and 
B bit fields to extend the reg, r/m, index, and base fields of the ModRM and SIB bytes in the various 
operand addressing modes to four bits. A fourth REX bit field (W) allows instruction encodings to 
specify a 64-bit operand size. 

Table 1-15 below and the sections that follow describe each of these bit fields. 


Table 1-15. REX Prefix-Byte Fields 


Mnemonic    Bit Position(s)    Definition

(none)      7:4                0100 (4h)

REX.W       3                  0 = Default operand size
                               1 = 64-bit operand size

REX.R       2                  1-bit (msb) extension of the ModRM reg field (note 1),
                               permitting access to 16 registers.

REX.X       1                  1-bit (msb) extension of the SIB index field (note 1),
                               permitting access to 16 registers.

REX.B       0                  1-bit (msb) extension of the ModRM r/m field (note 1),
                               SIB base field (note 1), or opcode reg field,
                               permitting access to 16 registers.

Notes:

1. For a description of the ModRM and SIB bytes, see “ModRM and SIB Bytes” on
page 17.


REX.W: Operand width (Bit 3). Setting the REX.W bit to 1 specifies a 64-bit operand size. Like the 
existing 66h operand-size override prefix, the REX 64-bit operand-size override has no effect on byte 
operations. For non-byte operations, the REX operand-size override takes precedence over the 66h 
prefix. If a 66h prefix is used together with a REX prefix that has the W bit set to 1, the 66h prefix is 
ignored. However, if a 66h prefix is used together with a REX prefix that has the W bit cleared to 0, 
the 66h prefix is not ignored and the operand size becomes 16 bits. 
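
The precedence between REX.W and the 66h prefix can be modeled in a few lines of Python. This is an 
illustrative sketch only: the function name and parameters are hypothetical, and it assumes an 
instruction whose default operand size in 64-bit mode is 32 bits. 

```python
def effective_operand_size(rex_w: bool, prefix_66h: bool, byte_operation: bool = False) -> int:
    """Return the operand size in bits for one instruction in 64-bit mode.

    Assumption: the instruction's default operand size is 32 bits.
    """
    if byte_operation:
        return 8    # neither REX.W nor 66h has any effect on byte operations
    if rex_w:
        return 64   # REX.W = 1 takes precedence; a 66h prefix is ignored
    if prefix_66h:
        return 16   # REX.W = 0 (or no REX prefix): 66h is not ignored
    return 32       # assumed default operand size


if __name__ == "__main__":
    for rex_w in (False, True):
        for prefix_66h in (False, True):
            size = effective_operand_size(rex_w, prefix_66h)
            print(f"REX.W={int(rex_w)} 66h={int(prefix_66h)} -> {size}-bit operand")
```

Instructions with a different default operand size would need a different fallback value in the 
last line of the function. 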

REX.R: Register field extension (Bit 2). The REX.R bit adds a 1-bit extension (in the most 
significant bit position) to the ModRM.reg field when that field encodes a GPR, YMM/XMM, control, 
or debug register. REX.R does not modify ModRM.reg when that field specifies other registers or is 
used to extend the opcode. REX.R is ignored in such cases. 

REX.X: Index field extension (Bit 1). The REX.X bit adds a 1-bit (msb) extension to the SIB.index 
field. See “ModRM and SIB Bytes” on page 17. 

REX.B: Base field extension (Bit 0). The REX.B bit adds a 1-bit (msb) extension to either the 
ModRM.r/m field to specify a GPR or XMM register, or to the SIB.base field to specify a GPR. (See 
Table 2-2 on page 56 for more about the B bit.) 
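
As a rough illustration of the four REX bit fields, the following Python sketch splits a REX byte 
into W, R, X, and B and applies R and B to a register-direct ModRM byte. The names `Rex`, 
`decode_rex`, and `extended_register_numbers` are hypothetical, and the helper assumes mod = 11 
(no SIB byte), so that REX.B extends ModRM.r/m rather than SIB.base. 

```python
from typing import NamedTuple, Optional, Tuple


class Rex(NamedTuple):
    w: int  # bit 3: 64-bit operand size
    r: int  # bit 2: extends ModRM.reg
    x: int  # bit 1: extends SIB.index
    b: int  # bit 0: extends ModRM.r/m, SIB.base, or opcode reg


def decode_rex(byte: int) -> Optional[Rex]:
    """Return the REX fields, or None if bits 7:4 are not 0100b."""
    if byte & 0xF0 != 0x40:
        return None
    return Rex((byte >> 3) & 1, (byte >> 2) & 1, (byte >> 1) & 1, byte & 1)


def extended_register_numbers(rex: Rex, modrm: int) -> Tuple[int, int]:
    """Return 4-bit (reg, r/m) register numbers for a mod = 11 ModRM byte."""
    reg = (rex.r << 3) | ((modrm >> 3) & 0b111)
    rm = (rex.b << 3) | (modrm & 0b111)
    return reg, rm


if __name__ == "__main__":
    rex = decode_rex(0x4D)                       # 0100 1101b
    print(rex)                                   # W=1, R=1, X=0, B=1
    print(extended_register_numbers(rex, 0xC8))  # reg = 9 (R9), r/m = 8 (R8)
```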

1.5 Displacement Bytes 

A displacement (also called an offset) is a signed value that is added to the base of a code segment 
(absolute addressing) or to an instruction pointer (relative addressing), depending on the addressing 
mode. The size of a displacement is 1, 2, or 4 bytes. If an addressing mode requires a displacement, the 
bytes (1, 2, or 4) for the displacement follow the opcode, ModRM, or SIB byte (whichever comes last) 
in the instruction encoding. 

In 64-bit mode, the same ModRM and SIB encodings are used to specify displacement sizes as those 
used in legacy and compatibility modes. However, the displacement is sign-extended to 64 bits during 
effective-address calculations. Also, in 64-bit mode, support is provided for some 64-bit displacement 
and immediate forms of the MOV instruction. See “Immediate Operand Size” in Volume 1 for more 
information on this. 
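
The sign extension applied to displacements in 64-bit mode can be sketched as follows. The helper 
names are hypothetical, and the model treats addresses as plain 64-bit integers without 
segmentation or canonical-address checks. 

```python
MASK64 = (1 << 64) - 1


def sign_extend_displacement(raw: int, size_bytes: int) -> int:
    """Sign-extend a 1-, 2-, or 4-byte displacement to a 64-bit two's-complement value."""
    if size_bytes not in (1, 2, 4):
        raise ValueError("displacement size must be 1, 2, or 4 bytes")
    bits = 8 * size_bytes
    raw &= (1 << bits) - 1
    if raw & (1 << (bits - 1)):      # sign bit of the encoded displacement is set
        raw -= 1 << bits
    return raw & MASK64


def add_displacement(base: int, raw_disp: int, size_bytes: int) -> int:
    """Add a sign-extended displacement to a 64-bit base, wrapping at 64 bits."""
    return (base + sign_extend_displacement(raw_disp, size_bytes)) & MASK64


if __name__ == "__main__":
    print(hex(sign_extend_displacement(0xF0, 1)))   # one-byte displacement of -16
    print(hex(add_displacement(0x1000, 0xF0, 1)))   # 0x1000 - 16
```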

1.6 Immediate Bytes 

An immediate is a value—typically an operand value—encoded directly into the instruction. 
Depending on the opcode and the operating mode, the size of an immediate operand can be 1, 2, 4, or 8 
bytes. 64-bit immediates are allowed in 64-bit mode on MOV instructions that load GPRs; otherwise, 
immediates are limited to 4 bytes. See “Immediate Operand Size” in Volume 1 for more information. 

If an instruction takes an immediate operand, the bytes (1, 2, 4, or 8) for the immediate follow the 
opcode, ModRM, SIB, or displacement bytes (whichever comes last) in the instruction encoding. Some 
128-bit media instructions use the immediate byte as a condition code. 

1.7 RIP-Relative Addressing 

In 64-bit mode, addressing relative to the contents of the 64-bit instruction pointer (program 
counter)—called RIP-relative addressing or PC-relative addressing—is implemented for certain 
instructions. In such cases, the effective address is formed by adding the displacement to the 64-bit 
RIP of the next instruction. 

In the legacy x86 architecture, addressing relative to the instruction pointer is available only in control- 
transfer instructions. In the 64-bit mode, any instruction that uses ModRM addressing can use RIP- 
relative addressing. This feature is particularly useful for addressing data in position-independent code 
and for code that addresses global data. 

Without RIP-relative addressing, ModRM instructions address memory relative to zero. With RIP- 
relative addressing, ModRM instructions can address memory relative to the 64-bit RIP using a signed 
32-bit displacement. This provides an offset range of ±2 Gbytes from the RIP. 

Programs usually have many references to data, especially global data, that are not register-based. To 
load such a program, the loader typically selects a location for the program in memory and then adjusts 
program references to global data based on the load location. RIP-relative addressing of data makes 
this adjustment unnecessary. 
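
The address calculation described in this section can be sketched as below. The function names are 
hypothetical; the instruction address and length are supplied by the caller, since the RIP used is 
that of the next instruction. 

```python
MASK64 = (1 << 64) - 1


def rip_relative_target(instr_addr: int, instr_len: int, disp32: int) -> int:
    """Effective address = RIP of the next instruction + sign-extended disp32."""
    next_rip = (instr_addr + instr_len) & MASK64
    disp32 &= 0xFFFFFFFF
    if disp32 & 0x80000000:          # negative 32-bit displacement
        disp32 -= 1 << 32
    return (next_rip + disp32) & MASK64


def disp32_for_target(instr_addr: int, instr_len: int, target: int) -> int:
    """Return the encoded disp32 that reaches target, or raise if it is out of range."""
    delta = target - (instr_addr + instr_len)
    if not -(1 << 31) <= delta < (1 << 31):
        raise ValueError("target is outside the +/-2-Gbyte RIP-relative range")
    return delta & 0xFFFFFFFF


if __name__ == "__main__":
    # A 7-byte instruction at 0x401000 referring to data 0x10 bytes past its end.
    print(hex(rip_relative_target(0x401000, 7, 0x10)))
    print(hex(disp32_for_target(0x401000, 7, 0x401000)))
```

Because the displacement depends only on the distance between the instruction and its data, the 
encoded bytes stay valid wherever the loader places the program, as long as code and data move 
together. 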

1.7.1 Encoding 

Table 1-16 shows the ModRM and SIB encodings for RIP-relative addressing. Redundant forms of 
32-bit displacement-only addressing exist in the current ModRM and SIB encodings. There is one 
ModRM encoding with several SIB encodings. RIP-relative addressing is encoded using one of the 
redundant forms. In 64-bit mode, the ModRM disp32 (32-bit displacement) encoding ({mod,r/m} = 
00101b) is redefined to be RIP + disp32 rather than displacement-only. 


Table 1-16. Encoding for RIP-Relative Addressing 

ModRM            SIB                Legacy and             64-bit Mode       Additional 64-bit
                                    Compatibility Modes                      Implications
mod = 00         not present        disp32                 RIP + disp32      Zero-based (normal)
r/m = 101                                                                    displacement addressing
                                                                             must use SIB form (see
                                                                             next row).
mod = 00         base = 101 (2)     disp32                 Same as Legacy    None
r/m = 100 (1)    index = 100 (3)
                 scale = xx

Notes: 
1. Encodes the indexed register-indirect addressing mode with 32-bit offset. 
2. Base register specification is null (base portion of effective address calculation is set to 0). 
3. Index register specification is null (scale*index portion of effective address calculation is set to 0). 


1.7.2 REX Prefix and RIP-Relative Addressing 

ModRM encoding for RIP-relative addressing does not depend on a REX prefix. In particular, the r/m 
encoding of 101, used to select RIP-relative addressing, is not affected by the REX prefix. For 
example, selecting R13 (REX.B = 1, r/m =101) with mod = 00 still results in RIP-relative addressing. 

The four-bit r/m field of ModRM is not fully decoded. Therefore, in order to address R13 with no 
displacement, software must encode it as R13 + 0 using a one-byte displacement of zero. 

1.7.3 Address-Size Prefix and RIP-Relative Addressing 

RIP-relative addressing is enabled by 64-bit mode, not by a 64-bit address-size. Conversely, use of the 
address-size prefix (“Address-Size Override Prefix” on page 9) does not disable RIP-relative 
addressing. The effect of the address-size prefix is to truncate and zero-extend the computed effective 
address to 32 bits, like any other addressing mode. 
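
The effect of the address-size prefix on a RIP-relative effective address can be illustrated with a 
small Python model. The function name and parameters are hypothetical; the sketch only shows the 
truncation and zero-extension step described above. 

```python
MASK64 = (1 << 64) - 1


def rip_relative_effective_address(next_rip: int, disp32: int,
                                   address_size_prefix: bool = False) -> int:
    """Compute RIP + disp32, then truncate to 32 bits if the 67h prefix is present."""
    disp32 &= 0xFFFFFFFF
    if disp32 & 0x80000000:
        disp32 -= 1 << 32
    effective = (next_rip + disp32) & MASK64
    if address_size_prefix:
        effective &= 0xFFFFFFFF      # truncated and zero-extended to 64 bits
    return effective


if __name__ == "__main__":
    next_rip = 0x1_0000_1000
    print(hex(rip_relative_effective_address(next_rip, 0x20)))
    print(hex(rip_relative_effective_address(next_rip, 0x20, address_size_prefix=True)))
```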

1.8 Encoding Considerations Using REX 

Figure 1-6 on page 28 shows four examples of how the R, X, and B bits of the REX prefix are 
concatenated with fields from the ModRM byte, SIB byte, and opcode to specify register and memory 
addressing. 

1.8.1 Byte-Register Addressing 

In the legacy architecture, the byte registers (AH, AL, BH, BL, CH, CL, DH, and DL, shown in 
Figure 2-2 on page 38) are encoded in the ModRM reg or r/m field or in the opcode reg field as 
registers 0 through 7. The REX prefix provides an additional byte-register addressing capability that 
makes the least-significant byte of any GPR available for byte operations (Figure 2-3 on page 39). 
This provides a uniform set of byte, word, doubleword, and quadword registers better suited for 
register allocation by compilers. 

1.8.2 Special Encodings for Registers 

Readers who need to know the details of instruction encodings should be aware that certain 
combinations of the ModRM and SIB fields have special meaning for register encodings. For some of 
these combinations, the instruction fields expanded by the REX prefix are not decoded (treated as 
don’t cares), thereby creating aliases of these encodings in the extended registers. Table 1-17 on 
page 27 describes how each of these cases behaves. 

Table 1-17. Special REX Encodings for Registers 

ModRM and SIB            Meaning in Legacy and       Implications in Legacy        Additional REX
Encodings (2)            Compatibility Modes         and Compatibility Modes       Implications

ModRM Byte:              SIB byte is present.        SIB byte is required for      REX prefix adds a fourth bit (b),
• mod ≠ 11                                           ESP-based addressing.         which is decoded and modifies
• r/m (1) = 100 (ESP)                                                              the base register in the SIB
                                                                                   byte. Therefore, the SIB byte
                                                                                   is also required for R12-based
                                                                                   addressing.

ModRM Byte:              Base register is not used.  Using EBP without a           REX prefix adds a fourth bit (x),
• mod = 00                                           displacement must be done by  which is not decoded (don’t
• r/m (1) = x101 (EBP)                               setting mod = 01 with a       care). Therefore, using RBP or
                                                     displacement of 0 (with or    R13 without a displacement
                                                     without an index register).   must be done via mod = 01 with
                                                                                   a displacement of 0.

SIB Byte:                Index register is not used. ESP cannot be used as an      REX prefix adds a fourth bit (x),
• index (1) = x100 (ESP)                             index register.               which is decoded. Therefore,
                                                                                   there are no additional
                                                                                   implications. The expanded
                                                                                   index field is used to
                                                                                   distinguish RSP from R12,
                                                                                   allowing R12 to be used as an
                                                                                   index.

SIB Byte:                Base register is not used   Base register depends on mod  REX prefix adds a fourth bit (b),
• base = b101 (EBP)      if ModRM.mod = 00.          encoding. Using EBP with a    which is not decoded (don’t
• ModRM.mod = 00                                     scaled index and without a    care). Therefore, using RBP or
                                                     displacement must be done by  R13 without a displacement
                                                     setting mod = 01 with a       must be done via mod = 01 with
                                                     displacement of 0.            a displacement of 0 (with or
                                                                                   without an index register).

Notes: 
1. The REX-prefix bit is shown in the fourth (most-significant) bit position of the encodings for the ModRM r/m, SIB 
index, and SIB base fields. The lower-case “x” for ModRM r/m (rather than the upper-case “B” shown in Figure 1-6 
on page 28) indicates that the REX-prefix bit is not decoded (don’t care). 
2. For a description of the ModRM and SIB bytes, see “ModRM and SIB Bytes” on page 17. 
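
The special cases in Table 1-17 can be illustrated with a small encoder for the simplest memory 
form, [base] with no index and no displacement. This is a sketch under stated assumptions: the 
function name is hypothetical, register numbers follow the 64-bit GPR encodings (RSP = 4, RBP = 5, 
R12 = 12, R13 = 13), only the low three bits of the reg operand are encoded, and the REX.B value is 
returned separately rather than emitted as a prefix byte. 

```python
from typing import Tuple


def encode_base_only(base: int, reg: int = 0) -> Tuple[int, bytes]:
    """Return (REX.B, ModRM [+ SIB] [+ disp8]) for the memory operand [base]."""
    if not 0 <= base <= 15:
        raise ValueError("base must be a GPR number in the range 0-15")
    rex_b = (base >> 3) & 1
    low = base & 0b111
    reg_bits = (reg & 0b111) << 3
    if low == 0b100:
        # RSP or R12: r/m = 100 selects the SIB form, so a SIB byte is required.
        modrm = (0b00 << 6) | reg_bits | 0b100
        sib = (0b00 << 6) | (0b100 << 3) | low   # scale = 00, index = 100 (none)
        return rex_b, bytes([modrm, sib])
    if low == 0b101:
        # RBP or R13: mod = 00 with r/m = 101 does not mean [base], so use
        # mod = 01 with a one-byte displacement of zero.
        modrm = (0b01 << 6) | reg_bits | 0b101
        return rex_b, bytes([modrm, 0x00])
    return rex_b, bytes([(0b00 << 6) | reg_bits | low])


if __name__ == "__main__":
    for number, name in ((0, "RAX"), (4, "RSP"), (5, "RBP"), (12, "R12"), (13, "R13")):
        rex_b, encoded = encode_base_only(number)
        print(f"[{name}]: REX.B={rex_b} bytes={encoded.hex(' ')}")
```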

Examples of Operand Addressing Extension Using REX 

Case 1: Register-Register Addressing (No Memory Operand) 
    Bytes:  REX Prefix (4WRXB) | Opcode | ModRM Byte (mod = 11, reg = rrr, r/m = bbb)
    REX.R + reg forms Rrrr; REX.B + r/m forms Bbbb.
    REX.X is not used.

Case 2: Memory Addressing Without an SIB Byte 
    Bytes:  REX Prefix (4WRXB) | Opcode | ModRM Byte (mod != 11, reg = rrr, r/m = bbb)
    REX.R + reg forms Rrrr; REX.B + r/m forms Bbbb.
    REX.X is not used.
    ModRM reg field != 100

Case 3: Memory Addressing With an SIB Byte 
    Bytes:  REX Prefix (4WRXB) | Opcode | ModRM Byte (mod != 11, reg = rrr, r/m = 100) |
            SIB Byte (scale = bb, index = xxx, base = bbb)
    REX.R + reg forms Rrrr; REX.X + index forms Xxxx; REX.B + base forms Bbbb.

Case 4: Register Operand Coded in Opcode Byte 
    Bytes:  REX Prefix (4WRXB) | Opcode Byte (op, reg = bbb)
    REX.B + opcode reg forms Bbbb.
    REX.R is not used.
    REX.X is not used.

Figure 1-6. Encoding Examples Using REX R, X, and B Bits 

1.9 Encoding Using the VEX and XOP Prefixes 

An extended escape sequence is introduced by an encoding escape prefix which establishes the context 
and the format of the bytes that follow. The currently defined prefixes fall in two classes: the XOP and 
the VEX prefixes (of which there are two). The XOP prefix and the VEX C4h prefix introduce a 
three-byte sequence with identical syntax, while the VEX C5h prefix introduces a two-byte escape 
sequence with a different syntax. 

These escape sequences supply fields used to extend operand specification as well as provide for the 
selection of alternate opcode maps. Encodings support up to two additional operands and the 
addressing of the extended (beyond 7) registers. The specification of two of the operands is 
accomplished using the legacy ModRM and optional SIB bytes with the reg, r/m, index, and base 
fields extended by one bit in a manner analogous to the REX prefix. 

The encoding of the extended SSE instructions utilizes extended escape sequences. XOP instructions 
use three-byte escape sequences introduced by the XOP prefix. The AVX, FMA, FMA4, and CLMUL 
instruction subsets use three-byte or two-byte escape sequences introduced by the VEX prefixes. 

1.9.1 Three-Byte Escape Sequences 

All the extended instructions can be encoded using a three-byte escape sequence, but certain VEX- 
encoded instructions that comply with the constraints described below in Section 1.9.2, “Two-Byte 
Escape Sequence” can also utilize a two-byte escape sequence. Figure 1-7 below shows the format of 
the three-byte escape sequence which is common to the XOP and VEX-based encodings. 

         Byte 0                      Byte 1                      Byte 2
 7                     0     7   6   5   4          0     7   6      3   2   1   0
 Encoding escape prefix      R   X   B   map_select       W   vvvv       L   pp

Figure 1-7. VEX/XOP Three-byte Escape Sequence Format 


Table 1-18. Three-byte Escape Sequence Field Definitions 

Byte   Bit     Mnemonic     Description
0      [7:0]   VEX, XOP     Value specific to the extended instruction set
1      [7]     R            Inverted one-bit extension of ModRM reg field
       [6]     X            Inverted one-bit extension of SIB index field
       [5]     B            Inverted one-bit extension, r/m field or SIB base field
       [4:0]   map_select   Opcode map select
2      [7]     W            Default operand size override for a general purpose register to
                            64-bit size in 64-bit mode; operand configuration specifier for
                            certain YMM/XMM-based operations.
       [6:3]   vvvv         Source or destination register selector, in ones’ complement format
       [2]     L            Vector length specifier
       [1:0]   pp           Implied 66, F2, or F3 opcode extension

AMD64 Technology    24594 Rev. 3.28 September 2019 



Byte 0 (VEX/XOP Prefix) 

Byte 0 is the encoding escape prefix byte which introduces the encoding escape sequence and 
establishes the context for the bytes that follow. The VEX and XOP prefixes have the following 
encodings: 

• VEX prefix is encoded as C4h 

• XOP prefix is encoded as 8Fh 

Byte 1 

VEX/XOP.R (Bit 7). The bit-inverted equivalent of the REX.R bit. A one-bit extension of the 
ModRM.reg field in 64-bit mode, permitting access to 16 YMM/XMM and GPR registers. In 32-bit 
protected and compatibility modes, the value must be 1. 

VEX/XOP.X (Bit 6). The bit-inverted equivalent of the REX.X bit. A one-bit extension of the 
SIB.index field in 64-bit mode, permitting access to 16 YMM/XMM and GPR registers. In 32-bit 
protected and compatibility modes, this value must be 1. 

VEX/XOP.B (Bit 5). The bit-inverted equivalent of the REX.B bit, available only in the 3-byte prefix 
format. A one-bit extension of either the ModRM.r/m field, to specify a GPR or XMM register, or of 
the SIB base field, to specify a GPR. This permits access to all 16 GPR and YMM/XMM registers. In 
32-bit protected and compatibility modes, this bit is ignored. 

VEX/XOP.map_select (Bits [4:0]). The five-bit map_select field is used to select an alternate 
opcode map. The map_select encoding spaces for VEX and XOP are disjoint. Table 1-19 below lists 
the encodings for VEX.map_select and Table 1-20 lists the encodings for XOP.map_select. 


Table 1-19. VEX.map_select Encoding 

Binary Value    Opcode Map          Analogous Legacy Opcode Map
00000           Reserved            -
00001           VEX opcode map 1    Secondary (“two-byte”) opcode map
00010           VEX opcode map 2    0F_38h (“three-byte”) opcode map
00011           VEX opcode map 3    0F_3Ah (“three-byte”) opcode map
00100-11111     Reserved            -


Table 1-20. XOP.map_select Encoding 

Binary Value    Opcode Map
00000-00111     Reserved
01000           XOP opcode map 8
01001           XOP opcode map 9
01010           XOP opcode map 10 (Ah)
01011-11111     Reserved


AVX instructions are encoded using the VEX opcode maps 1-3. The AVX instruction set includes 
instructions that provide operations similar to most legacy SSE instructions. For those AVX 
instructions that have an analogous legacy SSE instruction, the VEX opcode maps use the same binary 
opcode value and modifiers as the legacy version. The correspondence between the VEX opcode maps 
and the legacy opcode maps is shown in Table 1-19 above. 

VEX opcode maps 1-3 are also used to encode the FMA4 and FMA instructions. In addition, not all 
legacy SSE instructions have AVX equivalents. Therefore, the VEX opcode maps are not the same as 
the legacy opcode maps. 

The XOP opcode maps are unique to the XOP instructions. The XOP.map_select value is restricted to 
the range [08h:1Fh]. If the value of the XOP.map_select field is less than 8, the first two bytes of the 
three-byte XOP escape sequence are interpreted as a form of the POP instruction. 
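
A decoder therefore has to inspect the byte that follows 8Fh before treating it as an XOP prefix. 
The check can be sketched as follows; the function name and the returned strings are hypothetical, 
and the list of defined maps is taken from Table 1-20. 

```python
DEFINED_XOP_MAPS = (8, 9, 10)   # XOP opcode maps 8, 9, and 10 (Ah)


def classify_byte_after_8f(byte1: int) -> str:
    """Classify the byte following 8Fh using its low five bits (map_select)."""
    map_select = byte1 & 0x1F
    if map_select < 8:
        return "POP"                      # not an XOP escape sequence
    if map_select in DEFINED_XOP_MAPS:
        return f"XOP map {map_select}"
    return "XOP reserved map"


if __name__ == "__main__":
    for value in (0xC0, 0xE8, 0xE9, 0xEA, 0xEB):
        print(f"8F {value:02X}: {classify_byte_after_8f(value)}")
```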

Both legacy and extended opcode maps are covered in detail in Appendix A. 

Byte 2 

VEX/XOP.W (Bit 7). Function is instruction-specific. The bit is often used to configure source 
operand order. 

VEX/XOP.vvvv (Bits [6:3]). Used to specify an additional operand for three and four operand 
instructions. Encodes an XMM or YMM register in inverted ones’ complement form, as shown in 
Table 1-21. 

Table 1-21. VEX/XOP.vvvv Encoding 

Binary Value    Register        Binary Value    Register
0000            XMM15/YMM15     1000            XMM07/YMM07
0001            XMM14/YMM14     1001            XMM06/YMM06
0010            XMM13/YMM13     1010            XMM05/YMM05
0011            XMM12/YMM12     1011            XMM04/YMM04
0100            XMM11/YMM11     1100            XMM03/YMM03
0101            XMM10/YMM10     1101            XMM02/YMM02
0110            XMM09/YMM09     1110            XMM01/YMM01
0111            XMM08/YMM08     1111            XMM00/YMM00


Values 0000b to 0111b are not valid in 32-bit modes. The vvvv field is typically used to encode the 
first source operand, but for the VPSLLDQ, VPSRLDQ, VPSRLW, VPSRLD, VPSRLQ, VPSRAW, VPSRAD, 
VPSLLW, VPSLLD, and VPSLLQ shift instructions, the field specifies the destination register. 
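
The ones’ complement mapping in Table 1-21 amounts to inverting the four-bit register number. A 
minimal sketch, with hypothetical function names: 

```python
def encode_vvvv(register_number: int) -> int:
    """Return the vvvv field for XMM/YMM register 0-15 (ones' complement)."""
    if not 0 <= register_number <= 15:
        raise ValueError("register number must be in the range 0-15")
    return ~register_number & 0xF


def decode_vvvv(vvvv: int) -> int:
    """Return the register number selected by a four-bit vvvv field."""
    return ~vvvv & 0xF


def vvvv_valid_in_32bit_modes(vvvv: int) -> bool:
    """Values 0000b-0111b select registers 8-15, which 32-bit modes cannot use."""
    return (vvvv & 0xF) >= 0b1000


if __name__ == "__main__":
    for number in (0, 7, 8, 15):
        field = encode_vvvv(number)
        print(f"XMM{number:02d}: vvvv={field:04b} "
              f"valid in 32-bit modes: {vvvv_valid_in_32bit_modes(field)}")
```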

VEX/XOP.L (Bit 2). L = 0 specifies 128-bit vector length (XMM registers/128-bit memory 
locations). L = 1 specifies 256-bit vector length (YMM registers/256-bit memory locations). For SSE or 
XOP instructions with scalar operands, the L bit is ignored. Some vector SSE instructions support only 
the 128-bit vector size. For these instructions, L is cleared to 0. 

VEX/XOP.pp (Bits [1:0]). Specifies an implied 66h, F2h, or F3h opcode extension which is used in a 
way analogous to the legacy instruction encodings to extend the opcode encoding space. The 
correspondence between the encoding of the VEX/XOP.pp field and its function as an opcode modifier 
is shown in Table 1-22. The legacy prefixes 66h, F2h, and F3h are not allowed in the encoding of 
extended instructions. 


Table 1-22. VEX/XOP.pp Encoding 

Binary Value    Implied Prefix
00              None
01              66h
10              F3h
11              F2h
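
Putting the byte 1 and byte 2 field definitions together, a three-byte escape sequence could be 
unpacked as sketched below. The class and function names are hypothetical. The sketch returns R, X, 
and B in their un-inverted (REX-like) sense and vvvv as a register number, and it does not apply 
the XOP/POP disambiguation or any mode-specific restrictions. 

```python
from typing import NamedTuple, Optional

IMPLIED_PREFIX = {0b00: None, 0b01: 0x66, 0b10: 0xF3, 0b11: 0xF2}   # Table 1-22
ESCAPE_KIND = {0xC4: "VEX", 0x8F: "XOP"}


class EscapeFields(NamedTuple):
    kind: str
    r: int                          # un-inverted extension of ModRM.reg
    x: int                          # un-inverted extension of SIB.index
    b: int                          # un-inverted extension of ModRM.r/m or SIB.base
    map_select: int
    w: int
    vvvv_register: int              # register number after undoing ones' complement
    l: int                          # 0 = 128-bit, 1 = 256-bit vector length
    implied_prefix: Optional[int]


def decode_three_byte_escape(seq: bytes) -> EscapeFields:
    """Split a three-byte VEX (C4h) or XOP (8Fh) escape sequence into its fields."""
    if len(seq) < 3 or seq[0] not in ESCAPE_KIND:
        raise ValueError("not a three-byte VEX/XOP escape sequence")
    byte1, byte2 = seq[1], seq[2]
    return EscapeFields(
        kind=ESCAPE_KIND[seq[0]],
        r=((byte1 >> 7) & 1) ^ 1,
        x=((byte1 >> 6) & 1) ^ 1,
        b=((byte1 >> 5) & 1) ^ 1,
        map_select=byte1 & 0x1F,
        w=(byte2 >> 7) & 1,
        vvvv_register=~(byte2 >> 3) & 0xF,
        l=(byte2 >> 2) & 1,
        implied_prefix=IMPLIED_PREFIX[byte2 & 0b11],
    )


if __name__ == "__main__":
    print(decode_three_byte_escape(bytes([0xC4, 0x41, 0x35])))
```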


1.9.2 Two-Byte Escape Sequence 

All VEX-encoded instructions can be encoded using the three-byte escape sequence, but certain 
instructions can also be encoded utilizing a more compact, two-byte VEX escape sequence. The 
format of the two-byte escape sequence is shown in Figure 1-8 below. 

   Byte 0               Byte 1
 7        0     7   6      3   2   1   0
    VEX         R   vvvv       L   pp

Figure 1-8. VEX Two-byte Escape Sequence Format 


Table 1-23. VEX Two-byte Escape Sequence Field Definitions 

Prefix Byte   Bit     Mnemonic   Description
0             [7:0]   VEX        VEX 2-byte encoding escape prefix
1             [7]     R          Inverted one-bit extension of ModRM.reg field
              [6:3]   vvvv       Source or destination register selector, in ones’ complement format.
              [2]     L          Vector length specifier
              [1:0]   pp         Implied 66, F2, or F3 opcode extension.


Byte 0 (VEX Prefix) 

The VEX prefix for the two-byte escape sequence is encoded as C5h. 

Byte 1 

Note that bit 7 of this byte is used to encode VEX.R instead of VEX.W as in the three-byte escape 
sequence form. The R, vvvv, L, and pp fields are defined as in the three-byte escape sequence. 

When the two-byte escape sequence is used, specific fields from the three-byte format take on fixed 
values as shown in Table 1-24 below. 


Table 1-24. Fixed Field Values for VEX 2-Byte Format 

VEX Field     Value
X             1
B             1
W             0
map_select    00001b


Although they may be encoded using the VEX three-byte escape sequence, all instructions that 
conform with the constraints listed in Table 1-24 may be encoded using the two-byte escape sequence. 
Note that the implied value of map_select is 00001b, which means that only instructions included in 
the VEX opcode map 1 may be encoded using this format. 

VEX-encoded instructions that use the other defined values of map_select (00010b and 00011b) 
cannot be encoded using the two-byte escape sequence format. Note that the VEX.pp field value is 
explicitly encoded in this form and can be used to specify any of the implied legacy prefixes as defined 
in Table 1-22. 
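
The eligibility rule can be sketched as a function that shortens a three-byte VEX sequence when the 
fixed-field constraints hold. The function name is hypothetical, and the sketch works directly on 
the encoded (inverted) X and B bits, so the required stored value for each is 1. 

```python
def compress_vex(seq3: bytes) -> bytes:
    """Return the two-byte (C5h) form of a C4h escape sequence when it is eligible.

    If the X, B, W, or map_select constraints are not met, the three-byte
    sequence is returned unchanged.
    """
    if len(seq3) != 3 or seq3[0] != 0xC4:
        raise ValueError("expected a three-byte VEX (C4h) escape sequence")
    byte1, byte2 = seq3[1], seq3[2]
    encoded_x = (byte1 >> 6) & 1
    encoded_b = (byte1 >> 5) & 1
    map_select = byte1 & 0x1F
    w = (byte2 >> 7) & 1
    if (encoded_x, encoded_b, w, map_select) != (1, 1, 0, 0b00001):
        return seq3
    # Bit 7 of the new byte 1 carries R; vvvv, L, and pp keep their positions.
    return bytes([0xC5, (byte1 & 0x80) | (byte2 & 0x7F)])


if __name__ == "__main__":
    for sequence in (bytes([0xC4, 0xE1, 0x7C]), bytes([0xC4, 0xE2, 0x7D])):
        print(sequence.hex(" "), "->", compress_vex(sequence).hex(" "))
```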

2 Instruction Overview 


2.1 Instruction Groups 

For easier reference, the instruction descriptions are divided into five groups based on usage. The 
following sections describe the function, mnemonic syntax, opcodes, affected flags, and possible 
exceptions generated by all instructions in the AMD64 architecture: 

• Chapter 3, “General-Purpose Instruction Reference”: The general-purpose instructions are used 
in basic software execution. Most of these load, store, or operate on data in the general-purpose 
registers (GPRs), in memory, or in both. Other instructions are used to alter sequential program 
flow by branching to other locations within the program or to entirely different programs. 

• Chapter 4, “System Instruction Reference”: The system instructions establish the processor 
operating mode, access processor resources, handle program and system errors, and manage 
memory. 

• “SSE Instruction Reference” in Volume 4: The Streaming SIMD Extensions (SSE) instructions 
load, store, or operate on data located in the YMM/XMM registers. These instructions define both 
vector and scalar operations on floating-point and integer data types. They include the SSE and 
SSE2 instructions that operate on the YMM/XMM registers. Some of these instructions convert 
source operands in YMM/XMM registers to destination operands in GPR, MMX, or x87 registers 
or otherwise affect YMM/XMM state. 

• “64-Bit Media Instruction Reference” in Volume 5: The 64-bit media instructions load, store, or 
operate on data located in the 64-bit MMX registers. These instructions define both vector and 
scalar operations on integer and floating-point data types. They include the legacy MMX™ 
instructions, the 3DNow!™ instructions, and the AMD extensions to the MMX and 3DNow! 
instruction sets. Some of these instructions convert source operands in MMX registers to 
destination operands in GPR, YMM/XMM, or x87 registers or otherwise affect MMX state. 

• “x87 Floating-Point Instruction Reference” in Volume 5: The x87 instructions are used in legacy 
floating-point applications. Most of these instructions load, store, or operate on data located in the 
x87 ST(0)-ST(7) stack registers (the FPR0-FPR7 physical registers). The remaining instructions 
within this category are used to manage the x87 floating-point environment. 

The description of each instruction covers its behavior in all operating modes, including legacy mode 
(real, virtual-8086, and protected modes) and long mode (compatibility and 64-bit modes). Details of 
certain kinds of complex behavior—such as control-flow changes in CALL, INT, or FXSAVE 
instructions—have cross-references in the instruction-detail pages to detailed descriptions in volumes 
1 and 2. 

Two instructions—CMPSD and MOVSD—use the same mnemonic for different instructions. 
Assemblers can distinguish them on the basis of the number and type of operands with which they are 
used. 

2.2 Reference-Page Format 

Figure 2-1 on page 37 shows the format of an instruction-detail page. The instruction mnemonic is 
shown in bold at the top-left, along with its name. In this example, AAM is the mnemonic and ASCII 
Adjust After Multiply is the name. Next, there is a general description of the instruction’s operation. 
Many descriptions have cross-references to more detail in other parts of the manual. 

Beneath the general description, the mnemonic is shown again, together with the related opcode(s) and 
a description summary. Related instructions are listed below this, followed by a table showing the 
flags that the instruction can affect. Finally, each instruction has a summary of the possible exceptions 
that can occur when executing the instruction. The columns labeled “Real” and “Virtual-8086” apply 
only to execution in legacy mode. The column labeled “Protected” applies both to legacy mode and 
long mode, because long mode is a superset of legacy protected mode. 

The 128-bit and 64-bit media instructions also have diagrams illustrating the operation. A few 
instructions have examples or pseudocode describing the action. 

[Figure: annotated sample instruction-detail page for the AAM (ASCII Adjust After Multiply) 
instruction, taken from 24594 Rev. 3.07 September 2003. From top to bottom, the sample page 
contains the mnemonic and name, the general description of the operation, the 
Mnemonic/Opcode/Description table, the Related Instructions list, the rFLAGS Affected table, and 
the Exceptions table. Callouts on the page identify: 
• Mnemonic and any operands 
• Opcode 
• Description of operation 
• “M” means the flag is either set or cleared, depending on the result. 
• Possible exceptions and causes, by mode of operation 
• “Protected” column covers both legacy and long mode 
• Alphabetic mnemonic locator] 

Figure 2-1. Format of Instruction-Detail Pages 

2.3 Summary of Registers and Data Types 

This section summarizes the registers available to software using the five instruction subsets described 
in “Instruction Groups” on page 35. For details on the organization and use of these registers, see their 
respective chapters in volumes 1 and 2. 

2.3.1 General-Purpose Instructions 

Registers. The size and number of general-purpose registers (GPRs) depend on the operating 
mode, as does the size of the flags and instruction-pointer registers. Figure 2-2 shows the registers 
available in legacy and compatibility modes. 


Register    High      Low
Encoding    8-bit     8-bit     16-bit    32-bit
0           AH (4)    AL        AX        EAX
3           BH (7)    BL        BX        EBX
1           CH (5)    CL        CX        ECX
2           DH (6)    DL        DX        EDX
6                               SI        ESI
7                               DI        EDI
5                               BP        EBP
4                               SP        ESP

                                FLAGS     EFLAGS
                                IP        EIP

Figure 2-2. General Registers in Legacy and Compatibility Modes 

Figure 2-3 on page 39 shows the registers accessible in 64-bit mode. Compared with legacy mode, 
the registers become 64 bits wide, eight new data registers (R8-R15) are added, and the low byte of 
all 16 GPRs is available for byte operations. The four high-byte registers of legacy mode (AH, BH, 
CH, and DH) are not available if the REX prefix is used. Results of doubleword (32-bit) operations 
are zero-extended to 64 bits, but the high bits of the destination register are not modified by word 
and byte operations in 64-bit mode. The RFLAGS register is 64 bits wide, but the high 32 bits are 
reserved. They can be written with anything but they read as zeros (RAZ). 
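
The different treatment of 32-bit, 16-bit, and 8-bit results in 64-bit mode can be modeled as 
follows. This is a simplified sketch: the function name is hypothetical, the register is 
represented as a plain integer, and 8-bit writes target the low byte only (not AH, BH, CH, or DH). 

```python
MASK64 = (1 << 64) - 1


def write_gpr(old_value: int, result: int, operand_bits: int) -> int:
    """Return the new 64-bit register value after writing a result of the given size."""
    if operand_bits == 64:
        return result & MASK64
    if operand_bits == 32:
        return result & 0xFFFFFFFF               # upper 32 bits become zero
    if operand_bits in (8, 16):
        mask = (1 << operand_bits) - 1
        return (old_value & MASK64 & ~mask) | (result & mask)   # upper bits preserved
    raise ValueError("operand size must be 8, 16, 32, or 64 bits")


if __name__ == "__main__":
    old = 0xFFFF_FFFF_FFFF_FFFF
    for bits in (64, 32, 16, 8):
        print(f"{bits:2d}-bit write: {write_gpr(old, 0x12345678, bits):#018x}")
```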


Register    High     Low
Encoding    8-bit    8-bit     16-bit    32-bit    64-bit
0           AH*      AL        AX        EAX       RAX
3           BH*      BL        BX        EBX       RBX
1           CH*      CL        CX        ECX       RCX
2           DH*      DL        DX        EDX       RDX
6                    SIL**     SI        ESI       RSI
7                    DIL**     DI        EDI       RDI
5                    BPL**     BP        EBP       RBP
4                    SPL**     SP        ESP       RSP
8                    R8B       R8W       R8D       R8
9                    R9B       R9W       R9D       R9
10                   R10B      R10W      R10D      R10
11                   R11B      R11W      R11D      R11
12                   R12B      R12W      R12D      R12
13                   R13B      R13W      R13D      R13
14                   R14B      R14W      R14D      R14
15                   R15B      R15W      R15D      R15

RFLAGS (bits 63:32 shown as 0) 
RIP 

Bits 63:32 are zero-extended for 32-bit operands. 
Bits 63:16 are not modified for 16-bit operands. 

* Not addressable in REX prefix instruction forms 
** Only addressable in REX prefix instruction forms 

Figure 2-3. General Registers in 64-Bit Mode 


For most instructions running in 64-bit mode, access to the extended GPRs requires either a REX 
instruction modification prefix or extended encoding using the VEX or XOP sequences 
(page 14). 

Figure 2-4 shows the segment registers which, like the instruction pointer, are used by all instructions. 
In legacy and compatibility modes, all segments are accessible. In 64-bit mode, which uses the flat 
(non-segmented) memory model, only the CS, FS, and GS segments are recognized, whereas the 
contents of the DS, ES, and SS segment registers are ignored (the base for each of these segments is 
assumed to be zero, and neither their segment limit nor attributes are checked). For details, see 
“Segmented Virtual Memory” in Volume 2. 


Legacy Mode and         64-Bit Mode
Compatibility Mode
CS                      CS (Attributes only)
DS                      ignored
ES                      ignored
FS                      FS (Base only)
GS                      GS (Base only)
SS                      ignored
(bits 15:0)             (bits 15:0)

Figure 2-4. Segment Registers 

Data Types. Figure 2-5 on page 41 shows the general-purpose data types. They are all scalar, integer 
data types. The 64-bit (quadword) data types are only available in 64-bit mode, and for most 
instructions they require a REX instruction prefix. 

Data Type          Size                           Bits
Double Quadword    16 bytes (64-bit mode only)    127:0
Quadword           8 bytes (64-bit mode only)     63:0
Doubleword         4 bytes                        31:0
Word               2 bytes                        15:0
Byte               1 byte                         7:0

The figure shows each size as both a Signed Integer, with the sign bit (s) in the most-significant 
bit position, and an Unsigned Integer. 

Figure 2-5. General-Purpose Data Types 

2.3.2 System Instructions 

Registers. The system instructions use several specialized registers shown in Figure 2-6 on page 42. 
System software uses these registers to, among other things, manage the processor’s operating 
environment, define system resource characteristics, and monitor software execution. With the 
exception of the RFLAGS register, system registers can be read and written only from privileged 
software. 

All system registers are 64 bits wide, except for the descriptor-table registers and the task register, 
which include 64-bit base-address fields and other fields. 
System-Flags Register:               RFLAGS
Descriptor-Table Registers:          GDTR, IDTR, LDTR
Task Register:                       TR

Model-Specific Registers
  Extended-Feature-Enable Register:  EFER
  System-Configuration Register:     SYSCFG
  System-Linkage Registers:          STAR, LSTAR, CSTAR, SFMASK, FS.base, GS.base,
                                     KernelGSbase, SYSENTER_CS, SYSENTER_ESP, SYSENTER_EIP
  Debug-Extension Registers:         DebugCtl, LastBranchFromIP, LastBranchToIP,
                                     LastIntFromIP, LastIntToIP
  Memory-Typing Registers:           MTRRcap, MTRRdefType, MTRRphysBasen, MTRRphysMaskn,
                                     MTRRfixn, PAT, TOP_MEM, TOP_MEM2
  Performance-Monitoring Registers:  TSC, PerfEvtSeln, PerfCtrn
  Machine-Check Registers:           MCG_CAP, MCG_STAT, MCG_CTL, MCi_CTL, MCi_STATUS,
                                     MCi_ADDR, MCi_MISC

Figure 2-6. System Registers 

Data Structures. Figure 2-7 on page 43 shows the system data structures. These are created and 
maintained by system software for use in protected mode. A processor running in protected mode uses 
these data structures to manage memory and protection, and to store program-state information when 
an interrupt or task switch occurs. 
Figure 2-7. System Data Structures 


2.3.3 SSE Instructions 

Registers. The SSE instructions operate primarily on 128-bit and 256-bit floating-point vector 
operands located in the 256-bit YMM/XMM registers. Each 128-bit XMM register is defined as the 
lower octword of the corresponding YMM register. The number of available YMM/XMM data 
registers depends on the operating mode, as shown in Figure 2-8 below. In legacy and compatibility 
modes, eight YMM/XMM registers (YMM/XMM0-7) are available. In 64-bit mode, eight additional 
YMM/XMM data registers (YMM/XMM8-15) are available. These eight additional registers are 
addressed via the encoding extensions provided by the REX, VEX, and XOP prefixes. 

The MXCSR register contains floating-point and other control and status flags used by the 128-bit 
media instructions. Some 128-bit media instructions also use the GPR (Figure 2-2 and Figure 2-3) and 
the MMX registers (Figure 2-12 on page 48) or set or clear flags in the rFLAGS register (see 
Figure 2-2 and Figure 2-3). 

Figure 2-8. SSE Registers 

Data Types. The SSE instruction set architecture provides support for 128-bit and 256-bit packed 
floating-point and integer data types as well as integer and floating-point scalars. Figure 2-9 below 
shows the 128-bit data types. Figure 2-10 on page 46 and Figure 2-11 on page 47 show the 256-bit 
data types. The floating-point data types include IEEE-754 single precision and double precision 
types. 

Vector (Packed) Floating-Point - Double Precision and Single Precision 

Figure 2-9. 128-Bit SSE Data Types 

Figure 2-10. SSE 256-Bit Data Types 

Vector (Packed) Unsigned Integer - Double Quadword, Quadword, Doubleword, Word, Byte 

Note: 1) A 16-bit Half-Precision Floating-Point Scalar is also defined. 

Figure 2-11. SSE 256-Bit Data Types (Continued) 

2.3.4 64-Bit Media Instructions 

Registers. The 64-bit media instructions use the eight 64-bit MMX registers, as shown in 
Figure 2-12. These registers are mapped onto the x87 floating-point registers, and 64-bit media 
instructions write the x87 tag word in a way that prevents an x87 instruction from using MMX data. 

Some 64-bit media instructions also use the GPR (Figure 2-2 and Figure 2-3) and the XMM registers 
(Figure 2-8). 


MMX Data Registers 

Figure 2-12. 64-Bit Media Registers 

Data Types. Figure 2-13 on page 49 shows the 64-bit media data types. They include floating-point 
and integer vectors and integer scalars. The floating-point data type, used by 3DNow!™ instructions, 
consists of a packed vector of two IEEE-754 32-bit single-precision data types. Unlike other kinds of 
floating-point instructions, however, the 3DNow! instructions do not generate floating-point 
exceptions. For this reason, there is no register for reporting or controlling the status of exceptions in 
the 64-bit-media instruction subset. 


24594 — Rev. 3.28—September 2019 


Vector (Packed) Single-Precision Floating-Point 
  Two 32-bit elements (bits 63-32 and bits 31-0). Each element has a sign bit (s), an exponent 
  (exp), and a significand (bits 54-32 and bits 22-0, respectively). 

Vector (Packed) Signed Integers and Vector (Packed) Unsigned Integers 
  Two doublewords (bits 63-32, 31-0), four words (bits 63-48, 47-32, 31-16, 15-0), or eight bytes 
  (bits 63-56, 55-48, 47-40, 39-32, 31-24, 23-16, 15-8, 7-0). In the signed forms, s marks the 
  sign bit. 

Signed Integers and Unsigned Integers 
  Quadword (bits 63-0), doubleword (bits 31-0), word (bits 15-0), and byte (bits 7-0). In the 
  signed forms, s marks the sign bit. 

Figure 2-13. 64-Bit Media Data Types 
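
The packed single-precision layout can be illustrated with a short Python sketch that places two IEEE-754 single-precision values into one 64-bit quantity, the first in bits 31-0 and the second in bits 63-32. The helper names pack_two_singles and unpack_two_singles are hypothetical and exist only for this illustration; the sketch models the data layout, not any particular instruction.

```python
import struct

def pack_two_singles(low, high):
    """Pack two single-precision values into one 64-bit integer.

    'low' occupies bits 31-0 and 'high' occupies bits 63-32.
    """
    raw = struct.pack("<ff", low, high)      # two little-endian 32-bit floats
    return int.from_bytes(raw, "little")

def unpack_two_singles(value):
    """Inverse of pack_two_singles: returns the tuple (low, high)."""
    return struct.unpack("<ff", value.to_bytes(8, "little"))

packed = pack_two_singles(1.0, 2.0)
low, high = unpack_two_singles(packed)
```

Here 1.0 encodes as 3F80_0000h and 2.0 as 4000_0000h, so the packed value is 4000_0000_3F80_0000h.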

2.3.5 x87 Floating-Point Instructions 

Registers. The x87 floating-point instructions use the x87 registers shown in Figure 2-14. There are 
eight 80-bit data registers, three 16-bit registers that hold the x87 control word, status word, and tag 
word, and three registers (last instruction pointer, last opcode, last data pointer) that hold information 
about the last x87 operation. 

The physical data registers are named FPR0-FPR7, although x87 software references these registers 
as a stack of registers, named ST(0)-ST(7). The x87 instructions store operands only in their own 80- 
bit floating-point registers or in memory. They do not access the GPR or XMM registers. 

Figure 2-14. x87 Registers 

Data Types. Figure 2-15 on page 51 shows all x87 data types. They include three floating-point 
formats (80-bit double-extended precision, 64-bit double precision, and 32-bit single precision), three 
signed-integer formats (quadword, doubleword, and word), and an 80-bit packed binary-coded 
decimal (BCD) format. 

Figure 2-15. x87 Data Types 
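
As a rough sketch of the 80-bit packed BCD format, the following Python function encodes a signed integer into ten bytes holding 18 four-bit decimal digits. It assumes the conventional x87 memory layout, with the least-significant digit pair in the first byte and the sign in the most-significant bit of the last byte; the function name encode_packed_bcd is hypothetical.

```python
def encode_packed_bcd(value):
    """Encode an integer as a 10-byte packed BCD operand (sketch).

    Assumed layout: bytes 0-8 hold 18 decimal digits, two per byte,
    least-significant pair first; bit 7 of byte 9 is the sign.
    """
    if abs(value) >= 10 ** 18:
        raise ValueError("magnitude needs more than 18 decimal digits")
    digits = abs(value)
    out = bytearray(10)
    for i in range(9):
        low_digit = digits % 10
        digits //= 10
        high_digit = digits % 10
        digits //= 10
        out[i] = (high_digit << 4) | low_digit
    if value < 0:
        out[9] = 0x80          # sign bit set for negative values
    return bytes(out)

twelve = encode_packed_bcd(12)
minus_one = encode_packed_bcd(-1)
```

The value 12 is stored as the single byte 12h followed by zeros, and -1 differs from +1 only in the sign bit of the final byte.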

2.4 Summary of Exceptions 

Table 2-1 on page 52 lists all possible exceptions. The table shows the interrupt-vector numbers, 
names, mnemonics, source, and possible causes. Exceptions that apply to specific instructions are 
documented with each instruction in the instruction-detail pages that follow. 

Table 2-1. Interrupt-Vector Source and Cause 

Vector  Interrupt (Exception)             Mnemonic  Source              Cause
0       Divide-By-Zero-Error              #DE       Software            DIV, IDIV, AAM instructions
1       Debug                             #DB       Internal            Instruction accesses and data accesses
2       Non-Maskable-Interrupt            #NMI      External            External NMI signal
3       Breakpoint                        #BP       Software            INT3 instruction
4       Overflow                          #OF       Software            INTO instruction
5       Bound-Range                       #BR       Software            BOUND instruction
6       Invalid-Opcode                    #UD       Internal            Invalid instructions
7       Device-Not-Available              #NM       Internal            x87 instructions
8       Double-Fault                      #DF       Internal            Interrupt during an interrupt
9       Coprocessor-Segment-Overrun       -         External            Unsupported (reserved)
10      Invalid-TSS                       #TS       Internal            Task-state segment access and task switch
11      Segment-Not-Present               #NP       Internal            Segment access through a descriptor
12      Stack                             #SS       Internal            SS register loads and stack references
13      General-Protection                #GP       Internal            Memory accesses and protection checks
14      Page-Fault                        #PF       Internal            Memory accesses when paging enabled
15      Reserved                          -
16      Floating-Point Exception-Pending  #MF       Software            x87 floating-point and 64-bit media floating-point instructions
17      Alignment-Check                   #AC       Internal            Memory accesses
18      Machine-Check                     #MC       Internal, External  Model specific
19      SIMD Floating-Point               #XF       Internal            128-bit media floating-point instructions
20-29   Reserved (Internal and External)  -
30      SVM Security Exception            #SX       External            Security-Sensitive Events
31      Reserved (Internal and External)  -
0-255   External Interrupts (Maskable)    #INTR     External            External interrupt signal
0-255   Software Interrupts               -         Software            INTn instruction


2.5 Notation 

2.5.1 Mnemonic Syntax 

Each instruction has a syntax that includes the mnemonic and any operands that the instruction can 
take. Figure 2-16 shows an example of a syntax in which the instruction takes two operands. In most 
instructions that take two operands, the first (left-most) operand is both a source operand (the first 
source operand) and the destination operand. The second (right-most) operand serves only as a source, 
not a destination. 


ADDPD xmm1, xmm2/mem128 

ADDPD         Mnemonic 
xmm1          First Source Operand and Destination Operand 
xmm2/mem128   Second Source Operand 

Figure 2-16. Syntax for Typical Two-Operand Instruction 

The following notation is used to denote the size and type of source and destination operands: 

• cReg —Control register. 

• dReg —Debug register. 

• imm8 —Byte (8-bit) immediate. 

• imm16 —Word (16-bit) immediate. 

• imm16/32 —Word (16-bit) or doubleword (32-bit) immediate. 

• imm32 —Doubleword (32-bit) immediate. 

• imm32/64 —Doubleword (32-bit) or quadword (64-bit) immediate. 

• imm64 —Quadword (64-bit) immediate. 

• mem —An operand of unspecified size in memory. 

• mem8 —Byte (8-bit) operand in memory. 

• mem16 —Word (16-bit) operand in memory. 

• mem16/32 —Word (16-bit) or doubleword (32-bit) operand in memory. 

• mem32 —Doubleword (32-bit) operand in memory. 

• mem32/48 —Doubleword (32-bit) or 48-bit operand in memory. 

• mem48 —48-bit operand in memory. 

• mem64 —Quadword (64-bit) operand in memory. 

• mem128 —Double quadword (128-bit) operand in memory. 

• mem16:16 —Two sequential word (16-bit) operands in memory. 

• mem16:32 —A doubleword (32-bit) operand followed by a word (16-bit) operand in memory. 

• mem32real —Single-precision (32-bit) floating-point operand in memory. 

• mem16int —Word (16-bit) integer operand in memory. 

• mem32int —Doubleword (32-bit) integer operand in memory. 

• mem64real —Double-precision (64-bit) floating-point operand in memory. 

• mem64int —Quadword (64-bit) integer operand in memory. 

• mem80real —Double-extended-precision (80-bit) floating-point operand in memory. 

• mem80dec —80-bit packed BCD operand in memory, containing 18 4-bit BCD digits. 

• mem2env — 16-bit x87 control word or x87 status word. 

• mem14/28env —14-byte or 28-byte x87 environment. The x87 environment consists of the x87 
control word, x87 status word, x87 tag word, last non-control instruction pointer, last data pointer, 
and opcode of the last non-control instruction completed. 

• mem94/108env —94-byte or 108-byte x87 environment and register stack. 

• mem512env —512-byte environment for 128-bit media, 64-bit media, and x87 instructions. 

• mmx —Quadword (64-bit) operand in an MMX register. 

• mmx1 —Quadword (64-bit) operand in an MMX register, specified as the left-most (first) operand 
in the instruction syntax. 

• mmx2 —Quadword (64-bit) operand in an MMX register, specified as the right-most (second) 
operand in the instruction syntax. 

• mmx/mem32 —Doubleword (32-bit) operand in an MMX register or memory. 

• mmx/mem64 —Quadword (64-bit) operand in an MMX register or memory. 

• mmx1/mem64 —Quadword (64-bit) operand in an MMX register or memory, specified as the 
left-most (first) operand in the instruction syntax. 

• mmx2/mem64 —Quadword (64-bit) operand in an MMX register or memory, specified as the 
right-most (second) operand in the instruction syntax. 

• moffset —Direct memory offset that specifies an operand in memory. 

• moffset8 —Direct memory offset that specifies a byte (8-bit) operand in memory. 

• moffset16 —Direct memory offset that specifies a word (16-bit) operand in memory. 

• moffset32 —Direct memory offset that specifies a doubleword (32-bit) operand in memory. 

• moffset64 —Direct memory offset that specifies a quadword (64-bit) operand in memory. 

• pntr16:16 —Far pointer with 16-bit selector and 16-bit offset. 

• pntr16:32 —Far pointer with 16-bit selector and 32-bit offset. 

• reg —Operand of unspecified size in a GPR register. 

• reg8 —Byte (8-bit) operand in a GPR register. 

• reg16 —Word (16-bit) operand in a GPR register. 

• reg16/32 —Word (16-bit) or doubleword (32-bit) operand in a GPR register. 

• reg32 —Doubleword (32-bit) operand in a GPR register. 

• reg64 —Quadword (64-bit) operand in a GPR register. 

• reg/mem8 —Byte (8-bit) operand in a GPR register or memory. 

• reg/mem16 —Word (16-bit) operand in a GPR register or memory. 

• reg/mem32 —Doubleword (32-bit) operand in a GPR register or memory. 

• reg/mem64 —Quadword (64-bit) operand in a GPR register or memory. 

• rel8off —Signed 8-bit offset relative to the instruction pointer. 

• rel16off —Signed 16-bit offset relative to the instruction pointer. 

• rel32off —Signed 32-bit offset relative to the instruction pointer. 

• segReg or sReg —Word (16-bit) operand in a segment register. 

• ST(0) —x87 stack register 0. 

• ST(i) —x87 stack register where i is between 0 and 7. 

• xmm —Double quadword (128-bit) operand in an XMM register. 

• xmm1 —Double quadword (128-bit) operand in an XMM register, specified as the left-most (first) 
operand in the instruction syntax. 

• xmm2 —Double quadword (128-bit) operand in an XMM register, specified as the right-most 
(second) operand in the instruction syntax. 

• xmm/mem64 —Quadword (64-bit) operand in a 128-bit XMM register or memory. 

• xmm/mem128 —Double quadword (128-bit) operand in an XMM register or memory. 

• xmm1/mem128 —Double quadword (128-bit) operand in an XMM register or memory, specified as 
the left-most (first) operand in the instruction syntax. 

• xmm2/mem128 —Double quadword (128-bit) operand in an XMM register or memory, specified as 
the right-most (second) operand in the instruction syntax. 

• ymm —Double octword (256-bit) operand in a YMM register. 

• ymm1 —Double octword (256-bit) operand in a YMM register, specified as the left-most (first) 
operand in the instruction syntax. 

• ymm2 —Double octword (256-bit) operand in a YMM register, specified as the right-most 
(second) operand in the instruction syntax. 

• ymm/mem64 —Quadword (64-bit) operand in a 256-bit YMM register or memory. 

• ymm/mem128 —Double quadword (128-bit) operand in a YMM register or memory. 

• ymm1/mem256 —Double octword (256-bit) operand in a YMM register or memory, specified as 
the left-most (first) operand in the instruction syntax. 

• ymm2/mem256 —Double octword (256-bit) operand in a YMM register or memory, specified as 
the right-most (second) operand in the instruction syntax. 

2.5.2 Opcode Syntax 

In addition to the notation shown above in “Mnemonic Syntax” on page 52, the following notation 
indicates the size and type of operands in the syntax of an instruction opcode: 

• /digit —Indicates that the ModRM byte specifies only one register or memory (r/m) operand. The 
digit is specified by the ModRM reg field and is used as an instruction-opcode extension. Valid 
digit values range from 0 to 7. 

• /r —Indicates that the ModRM byte specifies both a register operand and a reg/mem (register or 
memory) operand. 

• cb, cw, cd, cp —Specifies a code-offset value and possibly a new code-segment register value. The 
value following the opcode is either one byte (cb), two bytes (cw), four bytes (cd), or six bytes 
(cp). 

• ib, iw, id, iq —Specifies an immediate-operand value. The opcode determines whether the value is 
signed or unsigned. The value following the opcode, ModRM, or SIB byte is either one byte (ib), 
two bytes (iw), four bytes (id), or eight bytes (iq). Word, doubleword, and quadword values start 
with the low-order byte. 

• +rb, +rw, +rd, +rq —Specifies a register value that is added to the hexadecimal byte on the left, 
forming a one-byte opcode. The result is an instruction that operates on the register specified by 
the register code. Valid register-code values are shown in Table 2-2. 

• m64 —Specifies a quadword (64-bit) operand in memory. 

• +i —Specifies an x87 floating-point stack operand, ST(i). The value is used only with x87 
floating-point instructions. It is added to the hexadecimal byte on the left, forming a one-byte 
opcode. Valid values range from 0 to 7. 
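
The /digit and /r notations both refer to fields of the ModRM byte. As an informal aid, the Python sketch below splits a ModRM byte into its three fields, assuming the standard layout of mod in bits 7-6, reg in bits 5-3, and r/m in bits 2-0 (the layout is described elsewhere in this manual, not in this section). The function name split_modrm is hypothetical.

```python
def split_modrm(byte):
    """Split a ModRM byte into (mod, reg, rm) fields (sketch).

    Assumed layout: mod = bits 7-6, reg = bits 5-3, r/m = bits 2-0.
    For a "/digit" opcode, the reg field carries the opcode extension.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("ModRM is a single byte")
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm

# Example: the byte sequence F7 D8 (an "F7 /3" form with a register operand).
mod, reg, rm = split_modrm(0xD8)
```

For the byte D8h, the reg field is 3, which is the digit in the "/3" opcode extension, and mod = 11b selects a register operand.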


Table 2-2. +rb, +rw, +rd, and +rq Register Value 

                              Specified Register
REX.B Bit (1)       Value     +rb           +rw     +rd     +rq
0 or no REX         0         AL            AX      EAX     RAX
Prefix              1         CL            CX      ECX     RCX
                    2         DL            DX      EDX     RDX
                    3         BL            BX      EBX     RBX
                    4         AH, SPL (1)   SP      ESP     RSP
                    5         CH, BPL (1)   BP      EBP     RBP
                    6         DH, SIL (1)   SI      ESI     RSI
                    7         BH, DIL (1)   DI      EDI     RDI
1                   0         R8B           R8W     R8D     R8
                    1         R9B           R9W     R9D     R9
                    2         R10B          R10W    R10D    R10
                    3         R11B          R11W    R11D    R11
                    4         R12B          R12W    R12D    R12
                    5         R13B          R13W    R13D    R13
                    6         R14B          R14W    R14D    R14
                    7         R15B          R15W    R15D    R15

1. See “REX Prefix” on page 14. 
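
Table 2-2 can be read as a lookup keyed by the register value, the operand kind, and the REX prefix state. The Python sketch below is one possible rendering of that lookup; the function name plus_r_register and its parameters are hypothetical. It assumes that, for +rb values 4 through 7, the presence of any REX prefix selects SPL, BPL, SIL, and DIL in place of AH, CH, DH, and BH.

```python
LEGACY = {
    "rb": ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"],
    "rw": ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"],
    "rd": ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"],
    "rq": ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"],
}
REX_BYTE_REGS = ["AL", "CL", "DL", "BL", "SPL", "BPL", "SIL", "DIL"]
EXTENDED_SUFFIX = {"rb": "B", "rw": "W", "rd": "D", "rq": ""}

def plus_r_register(value, kind, rex_b=0, rex_present=False):
    """Return the register selected by a +rb/+rw/+rd/+rq value (sketch)."""
    if not 0 <= value <= 7:
        raise ValueError("register value must be 0..7")
    if rex_b:
        # REX.B = 1 selects the extended registers R8-R15.
        return f"R{8 + value}{EXTENDED_SUFFIX[kind]}"
    if kind == "rb" and rex_present:
        # Assumption: any REX prefix replaces AH/CH/DH/BH with SPL/BPL/SIL/DIL.
        return REX_BYTE_REGS[value]
    return LEGACY[kind][value]

# Example: opcode byte BBh in a "B8 +rd" form; the low three bits give the value.
selected = plus_r_register(0xBB & 0b111, "rd")
```

In the example, the low three bits of BBh are 3, which selects EBX.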

2.5.3 Pseudocode Definition 

Pseudocode examples are given for the actions of several complex instructions (for example, see 
“CALL (Near)” on page 126). The following definitions apply to all such pseudocode examples: 

///////////////////////////////////////////////////////////////////////////////// 

// Pseudo Code Definition 

///////////////////////////////////////////////////////////////////////////////// 

// 

// Comments start with double slashes. 

// 

// '=' can mean "is", or assignment based on context 
// '==' is the equals comparison operator 

// 

///////////////////////////////////////////////////////////////////////////////// 

// Constants 

///////////////////////////////////////////////////////////////////////////////// 

0 // numbers are in base-10 (decimal), unless followed by a suffix 

0000_0001b // a number in binary notation, underbars added for readability 

FFE0_0000h // a number expressed in hexadecimal notation 

// in the following, '&&' is the logical AND operator. See "Logical Operators" 

// below. 

// reg[fld] identifies a field (one or more bits) within architected register 
// or within a sub-element of a larger data structure. A dot separates the 
// higher-level data structure name from the sub-element name. 

// 

CS.desc = Code Segment descriptor // CS.desc has sub-elements: base, limit, attr 
SS.desc = Stack Segment descriptor // SS.desc has the same sub-elements 
CS.desc.base = base subfield of CS.desc 
CS = Code Segment Register 
SS = Stack Segment Register 

CPL = Current Privilege Level (0 <= CPL <= 3) 

REAL_MODE = (CR0[PE] == 0) 

PROTECTED_MODE = ((CR0[PE] == 1) && (RFLAGS[VM] == 0)) 

VIRTUAL_MODE = ((CR0[PE] == 1) && (RFLAGS[VM] == 1)) 

LEGACY_MODE = (EFER[LMA] == 0) 

LONG_MODE = (EFER[LMA] == 1) 

64BIT_MODE = ((EFER[LMA] == 1) && (CS.desc.attr[L] == 1) && (CS.desc.attr[D] == 0)) 
COMPATIBILITY_MODE = ((EFER[LMA] == 1) && (CS.desc.attr[L] == 0)) 

PAGING_ENABLED = (CR0[PG] == 1) 

ALIGNMENT_CHECK_ENABLED = ((CR0[AM] == 1) && (RFLAGS[AC] == 1) && (CPL == 3)) 

OPERAND_SIZE = 16, 32, or 64 // size, in bits, of an operand 

// OPERAND_SIZE depends on processor mode, the current code segment descriptor 
// default operand size [D], presence of the operand size override prefix (66h) 

// and, in 64-bit mode, the REX prefix. 

// NOTE: Specific instructions take 8-bit operands, but for these instructions, 

// operand size is fixed and the variable OPERAND_SIZE is not needed. 

ADDRESS_SIZE = 16, 32, or 64 // size, in bits, of the effective address for 
// memory reads. ADDRESS_SIZE depends on processor mode, the current code segment 
// descriptor default operand size [D], and the presence of the address size 
// override prefix (67h) 

STACK_SIZE = 16, 32, or 64 // size, in bits, of the stack operation operand 

// STACK_SIZE depends on current code segment descriptor attribute D bit and 
// the Stack Segment descriptor attribute B bit. 


///////////////////////////////////////////////////////////////////////////////// 

// Architected Registers 

///////////////////////////////////////////////////////////////////////////////// 
// Identified using abbreviated names assigned by the Architecture; can represent 
// the register or its contents depending on context. 

RAX = the 64-bit contents of the general-purpose register 
EAX = 32-bit contents of GPR EAX 
AX = 16-bit contents of GPR AX 
AL = lower 8 bits of GPR AX 
AH = upper 8 bits of GPR AX 

index_of(reg) = value used to encode the register. 
index_of(AX) = 0000b 
index_of(RAX) = 0000b 

// in legacy and compatibility modes the msb of the index is fixed as 0 


///////////////////////////////////////////////////////////////////////////////// 

// Defined Variables 

///////////////////////////////////////////////////////////////////////////////// 


old_RIP = RIP at the start of current instruction 
old_RSP = RSP at the start of current instruction 
old_RFLAGS = RFLAGS at the start of the instruction 
old_CS = CS selector at the start of current instruction 
old_DS = DS selector at the start of current instruction 
old_ES = ES selector at the start of current instruction 
old_FS = FS selector at the start of current instruction 
old_GS = GS selector at the start of current instruction 
old_SS = SS selector at the start of current instruction 

RIP = the current RIP register 
RSP = the current RSP register 
RBP = the current RBP register 
RFLAGS = the current RFLAGS register 
next_RIP = RIP at start of next instruction 

CS.desc = the current CS descriptor, including the subfields: 
base limit attr 
SS.desc = the current SS descriptor, including the subfields: 
base limit attr 

SRC = the instruction's source operand 
SRC1 = the instruction's first source operand 
SRC2 = the instruction's second source operand 
SRC3 = the instruction's third source operand 
IMM8 = 8-bit immediate encoded in the instruction 
IMM16 = 16-bit immediate encoded in the instruction 

IMM32 = 32-bit immediate encoded in the instruction 

IMM64 = 64-bit immediate encoded in the instruction 

DEST = instruction's destination register 

temp_* // 64-bit temporary register 

temp_*_desc // temporary descriptor, with sub-elements: 
// if it points to a block of memory: base limit attr 
// if it's a gate descriptor: offset segment attr 

NULL = 0000h // null selector is all zeros 

///////////////////////////////////////////////////////////////////////////////// 

// Exceptions 

///////////////////////////////////////////////////////////////////////////////// 
EXCEPTION [#GP(0)] // Signals an exception; error code in parentheses 
EXCEPTION [#UD] // if no error code 

// possible exception types: 

#DE // Divide-By-Zero-Error Exception (Vector 0) 

#DB // Debug Exception (Vector 1) 

#BP // INT3 Breakpoint Exception (Vector 3) 

#OF // INTO Overflow Exception (Vector 4) 

#BR // Bound-Range Exception (Vector 5) 

#UD // Invalid-Opcode Exception (Vector 6) 

#NM // Device-Not-Available Exception (Vector 7) 

#DF // Double-Fault Exception (Vector 8) 

#TS // Invalid-TSS Exception (Vector 10) 

#NP // Segment-Not-Present Exception (Vector 11) 

#SS // Stack Exception (Vector 12) 

#GP // General-Protection Exception (Vector 13) 

#PF // Page-Fault Exception (Vector 14) 

#MF // x87 Floating-Point Exception-Pending (Vector 16) 

#AC // Alignment-Check Exception (Vector 17) 

#MC // Machine-Check Exception (Vector 18) 

#XF // SIMD Floating-Point Exception (Vector 19) 



// V,Z,A,S are integer variables, assigned a value when an instruction begins 
// executing (they can be assigned a different value in the middle of an 
// instruction, if needed) 

IF (OPERAND_SIZE == 16) V = 2 

IF (OPERAND_SIZE == 32) V = 4 

IF (OPERAND_SIZE == 64) V = 8 

IF (OPERAND_SIZE == 16) Z = 2 

IF (OPERAND_SIZE == 32) Z = 4 

IF (OPERAND_SIZE == 64) Z = 4 

IF (ADDRESS_SIZE == 16) A = 2 

IF (ADDRESS_SIZE == 32) A = 4 

IF (ADDRESS_SIZE == 64) A = 8 

IF (STACK_SIZE == 16) S = 2 

IF (STACK_SIZE == 32) S = 4 

IF (STACK_SIZE == 64) S = 8 
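
The assignments of V, Z, A, and S above can be restated compactly. The Python sketch below is offered only as an informal reading aid, not as part of the pseudocode definition; the function name size_variables is hypothetical. Note that Z is capped at 4 bytes even when OPERAND_SIZE is 64.

```python
def size_variables(operand_size, address_size, stack_size):
    """Return (V, Z, A, S) in bytes for the given sizes in bits (sketch)."""
    bytes_for = {16: 2, 32: 4, 64: 8}
    v = bytes_for[operand_size]
    z = min(v, 4)                  # Z never exceeds 4, even for 64-bit operands
    a = bytes_for[address_size]
    s = bytes_for[stack_size]
    return v, z, a, s

long_mode_sizes = size_variables(64, 64, 64)
```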

///////////////////////////////////////////////////////////////////////////////// 

// Bit Range Inside a Register 

///////////////////////////////////////////////////////////////////////////////// 

temp_data[x:y] // Bits x through y (inclusive) of temp_data 

///////////////////////////////////////////////////////////////////////////////// 

// Variables and data types 

///////////////////////////////////////////////////////////////////////////////// 

NxtValue = 5 //default data type is unsigned int. 

int //abstract data type representing an integer 

bool //abstract data type; either TRUE or FALSE 

vector //An array of data elements. Individual elements are accessed via 
//an unsigned integer zero-based index. Elements have a data type. 
bit //a single bit 

byte //8-bit value 

word //16-bit value 

doubleword //32-bit value 

quadword //64-bit value 

octword //128-bit value 

double octword //256-bit value 

unsigned int aval //treat aval as an unsigned integer value 
signed int valx //treat valx as a signed integer value 

bit vector b_vect //b_vect is an array of data elements. Each element is a bit. 
b_vect[5] //The sixth element (bit) in the array. Indices are 0-based. 

///////////////////////////////////////////////////////////////////////////////// 

// Elements Within a packed data type 

///////////////////////////////////////////////////////////////////////////////// 

// element i (zero-based) of size w occupies bits [w(i+1)-1 : wi] 

///////////////////////////////////////////////////////////////////////////////// 
// Moving Data From One Register To Another 
///////////////////////////////////////////////////////////////////////////////// 
temp_dest.b = temp_src; // 1-byte move (copies lower 8 bits of temp_src to 

// temp_dest, preserving the upper 56 bits of temp_dest) 
temp_dest.w = temp_src; // 2-byte move (copies lower 16 bits of temp_src to 
// temp_dest, preserving the upper 48 bits of temp_dest) 
temp_dest.d = temp_src; // 4-byte move (copies lower 32 bits of temp_src to 
// temp_dest; zeros out the upper 32 bits of temp_dest) 
temp_dest.q = temp_src; // 8-byte move (copies all 64 bits of temp_src to 
// temp_dest) 

temp_dest.v = temp_src; // 2-byte move if V==2 
// 4-byte move if V==4 
// 8-byte move if V==8 

temp_dest.z = temp_src; // 2-byte move if Z==2 
// 4-byte move if Z==4 

temp_dest.a = temp_src; // 2-byte move if A==2 

// 4-byte move if A==4 

// 8-byte move if A==8 

temp_dest.s = temp_src; // 2-byte move if S==2 

// 4-byte move if S==4 

// 8-byte move if S==8 
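
The fixed-size moves defined above differ in how they treat the upper bits of the 64-bit destination: the 1-byte and 2-byte moves preserve them, while the 4-byte move zeros them. The Python sketch below models the .b, .w, .d, and .q forms on 64-bit integers as an informal illustration; the function name move is hypothetical and the variable-size suffixes (.v, .z, .a, .s) are not modelled.

```python
MASK64 = (1 << 64) - 1

def move(dest, src, suffix):
    """Return the new 64-bit destination value after a sized move (sketch)."""
    if suffix == "b":   # 1-byte move, upper 56 bits of dest preserved
        return (dest & ~0xFF & MASK64) | (src & 0xFF)
    if suffix == "w":   # 2-byte move, upper 48 bits of dest preserved
        return (dest & ~0xFFFF & MASK64) | (src & 0xFFFF)
    if suffix == "d":   # 4-byte move, upper 32 bits of dest zeroed
        return src & 0xFFFF_FFFF
    if suffix == "q":   # 8-byte move, all 64 bits copied
        return src & MASK64
    raise ValueError("suffix must be one of b, w, d, q")

dest = 0x1122334455667788
src = 0xAABBCCDDEEFF0011
after_byte_move = move(dest, src, "b")
after_dword_move = move(dest, src, "d")
```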

///////////////////////////////////////////////////////////////////////////////// 

// Arithmetic Operators 

///////////////////////////////////////////////////////////////////////////////// 

a + b // integer addition 

a - b // integer subtraction 

a * b // integer multiplication 

a / b // integer division. Result is the quotient 

a % b // modulo. Result is the remainder after a is divided by b 

// multiplication has precedence over addition where precedence is not explicitly 
// indicated by grouping terms with parentheses 

///////////////////////////////////////////////////////////////////////////////// 

// Bitwise Operators 

///////////////////////////////////////////////////////////////////////////////// 

// temp, a, and b are values or register contents of the same size 

temp = a AND b; // Corresponding bits of a and b are logically ANDed together 

temp = a OR b; // Corresponding bits of a and b are logically ORed together 

temp = a XOR b; // Each bit of temp is the exclusive OR of the corresponding 

// bits of a and b 

temp = NOT a; // Each bit of temp is the complement of the corresponding 

// bit of a 

// Concatenation 

value = {field1,field2,100b}; //pack values of field1, field2 and 100b 
size_of(value) = (size_of(field1) + size_of(field2) + 3) 

///////////////////////////////////////////////////////////////////////////////// 

// Logical Shift Operators 

///////////////////////////////////////////////////////////////////////////////// 

temp = a << b; // Result is a shifted left by b bit positions. Zeros are 

// shifted into vacant positions. Bits shifted out are lost. 
temp = a >> b; // Result is a shifted right by b bit positions. Zeros are 

// shifted into vacant positions. Bits shifted out are lost. 

///////////////////////////////////////////////////////////////////////////////// 

// Logical Operators 

///////////////////////////////////////////////////////////////////////////////// 

// a boolean variable can assume one of two values (TRUE or FALSE) 

// In these examples, FOO, BAR, CONE, and HEAD have been defined to be boolean 
// variables 

FOO && BAR // Logical AND 

FOO || BAR // Logical OR 

!FOO // Logical complement (NOT) 

///////////////////////////////////////////////////////////////////////////////// 

// Comparison Operators 

///////////////////////////////////////////////////////////////////////////////// 

// a and b are integer values. The result is a boolean value. 

a == b // if a and b are equal, the result is TRUE; otherwise it is FALSE. 

a != b // if a and b are not equal, the result is TRUE; otherwise it is FALSE. 

a > b // if a is greater than b, the result is TRUE; otherwise it is FALSE. 

a < b // if a is less than b, the result is TRUE; otherwise it is FALSE. 

a >= b // if a is greater than or equal to b, the result is TRUE; otherwise 

// it is FALSE. 

a <= b // if a is less than or equal to b, the result is TRUE; otherwise 

// it is FALSE. 

///////////////////////////////////////////////////////////////////////////////// 

// Logical Expressions 

///////////////////////////////////////////////////////////////////////////////// 
// Logical binary (two operand) and unary (one operand) operators can be combined 
// with comparison operators to form more complex expressions. Parentheses are 
// used to enclose comparison terms and to show precedence. If precedence is not 
// explicitly shown, logical AND has precedence over logical OR. Unary operators 
// have precedence over binary operators. 

FOO && (a < b) || !BAR // evaluate the comparison a < b first, then 

// AND this with FOO. Finally OR this intermediate result 
// with the complement of BAR. 

// Logical expressions can be English phrases that can be evaluated to be TRUE 
// or FALSE. Statements assume knowledge of the system architecture (Volumes 1 and 
// 2). 

///////////////////////////////////////////////////////////////////////////////// 

IF (it is raining) 
close the window 

///////////////////////////////////////////////////////////////////////////////// 

// Assignment Operators 

///////////////////////////////////////////////////////////////////////////////// 
a = a + b // The value a is assigned the sum of the values a and b.

temp = R1 // The contents of the register temp are replaced by a copy of the
          // contents of register R1.

R0 += 2 // R0 is assigned the sum of the contents of R0 and the integer 2.

R5 |= R6 // R5 is assigned the result of the bit-wise OR of the contents of R5
         // and R6. The contents of R6 are unchanged.

R4 &= R7 // R4 is assigned the result of the bit-wise AND of the contents of
         // R4 and R7. The contents of R7 are unchanged.

///////////////////////////////////////////////////////////////////////////////// 
// IF-THEN-ELSE 

///////////////////////////////////////////////////////////////////////////////// 

IF (FOO) <expression>       // evaluation of <expression> is dependent on FOO
                            // being TRUE. If FOO is FALSE, <expression> is not
                            // evaluated.

IF (FOO)
    <dependent expression1> // scope of IF is indicated by indentation
    <dependent expressionx>

IF (FOO)
    <dependent expression>
ELSIF (BAR)
    <alt expression>
ELSE
    <default expressions>

// If FOO is TRUE, <dependent expression> is
// evaluated and the remaining ELSIF and ELSE
// clauses are skipped.
//
// If FOO is FALSE and BAR is TRUE, <alt expression>
// is evaluated and the subsequent ELSIF or ELSE
// clauses are skipped.
//
// <default expressions> are evaluated if all the preceding IF and ELSIF
// conditions are FALSE.

IF ((FOO && BAR) || (CONE && HEAD)) // The condition can be an expression.
    <dependent expressions>


///////////////////////////////////////////////////////////////////////////////// 

// Loops 

///////////////////////////////////////////////////////////////////////////////// 

FOR i = <init_val> to <final_val>, BY <step>
    <expression>            // scope of loop is indicated by indentation
                            // if <step> = 1, may omit "BY" clause

// nested loop example
temp = 0                    // initialize temp
FOR i = 0 to 7              // i takes on the values 0 through 7 in succession
    temp += 1               // in the outer loop. Evaluated a total of 8 times.
    FOR j = 0 to 7, BY 2    // j takes on the values 0, 2, 4, and 6; but not 7.
        <inner-most exp>    // This will be evaluated a total of 8 * 4 times.
<next expression outside both loops>

// C Language form of loop syntax is also allowed

FOR (i = 0; i < MAX; i++)
{
    <expressions>           // evaluated MAX times
}
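
The iteration counts stated for the nested loop example can be checked with a short Python sketch. It assumes the pseudo-code's `to` bound is inclusive, as the comments indicate; the counter name `inner_evaluations` is introduced here only for illustration.

```python
temp = 0
inner_evaluations = 0              # hypothetical counter, not in the pseudo-code

for i in range(0, 8):              # FOR i = 0 to 7 (upper bound inclusive)
    temp += 1                      # outer body: evaluated 8 times
    for j in range(0, 8, 2):       # FOR j = 0 to 7, BY 2 -> j = 0, 2, 4, 6
        inner_evaluations += 1     # inner body: evaluated 8 * 4 times

print(temp, inner_evaluations)
```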

///////////////////////////////////////////////////////////////////////////////// 

// Functions 

///////////////////////////////////////////////////////////////////////////////// 

// Syntax for function definition 

<return data type> <function name>(argument,..) 

<expressions> 

RETURN <result> 

///////////////////////////////////////////////////////////////////////////////// 

// Built-in Functions 

///////////////////////////////////////////////////////////////////////////////// 
SignExtend(arg) // returns value of arg sign extended to the width of the data 
// type of the function. Data type of function is inferred from 
// the context of the function's invocation. 


ZeroExtend(arg) // returns value of arg zero extended to the width of the data 
// type of the function. Data type of function is inferred from 
// the context of the function's invocation. 

indexof(reg) //returns binary value used to encode reg specification 

///////////////////////////////////////////////////////////////////////////////// 

// READ_MEM 

// General memory read. This zero-extends the data to 64 bits and returns it. 

///////////////////////////////////////////////////////////////////////////////// 

usage:

    temp = READ_MEM.x [seg:offset]  // where x is one of {v, z, b, w, d, q}
                                    // and denotes the size of the memory read

definition:

    IF ((seg AND 0xFFFC) == NULL)   // GP fault for using a null segment to
                                    // reference memory
        EXCEPTION [#GP(0)]

    IF ((seg==CS) || (seg==DS) || (seg==ES) || (seg==FS) || (seg==GS))
            // CS,DS,ES,FS,GS check for segment limit or canonical
        IF ((!64BIT_MODE) && (offset is outside seg's limit))
            EXCEPTION [#GP(0)]
            // #GP fault for segment limit violation in non-64-bit mode
        IF ((64BIT_MODE) && (offset is non-canonical))
            EXCEPTION [#GP(0)]
            // #GP fault for non-canonical address in 64-bit mode
    ELSIF (seg==SS) // SS checks for segment limit or canonical
        IF ((!64BIT_MODE) && (offset is outside seg's limit))
            EXCEPTION [#SS(0)]
            // stack fault for segment limit violation in non-64-bit mode
        IF ((64BIT_MODE) && (offset is non-canonical))
            EXCEPTION [#SS(0)]
            // stack fault for non-canonical address in 64-bit mode
    ELSE // ((seg==GDT) || (seg==LDT) || (seg==IDT) || (seg==TSS))
            // GDT,LDT,IDT,TSS check for segment limit and canonical
        IF (offset > seg.limit)
            EXCEPTION [#GP(0)]      // #GP fault for segment limit violation
                                    // in all modes
        IF ((LONG_MODE) && (offset is non-canonical))
            EXCEPTION [#GP(0)]      // #GP fault for non-canonical address in long mode

    IF ((ALIGNMENT_CHECK_ENABLED) && (offset misaligned, considering its
                                      size and alignment))
        EXCEPTION [#AC(0)]

    IF ((64BIT_MODE) && ((seg==CS) || (seg==DS) || (seg==ES) || (seg==SS)))
        temp_linear = offset
    ELSE
        temp_linear = seg.base + offset

    IF ((PAGING_ENABLED) && (virtual-to-physical translation for temp_linear
                             results in a page-protection violation))
        EXCEPTION [#PF(error_code)] // page fault for page-protection violation
                                    // (U/S violation, Reserved bit violation)

    IF ((PAGING_ENABLED) && (temp_linear is on a not-present page))
        EXCEPTION [#PF(error_code)] // page fault for not-present page

    temp_data = memory [temp_linear].x  // zero-extends the data to 64
                                        // bits, and saves it in temp_data

    RETURN (temp_data)              // return the zero-extended data
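
The "offset is non-canonical" test in the definition above can be sketched in Python. This is a minimal illustration, assuming an implementation with 48 virtual-address bits (the pseudo-code does not state the width); the function name `is_canonical` and the `va_bits` parameter are hypothetical.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if a 64-bit address is in canonical form.

    va_bits is the implemented virtual-address width (assumed 48 here).
    Bits 63 down to va_bits-1 must be all zeros or all ones.
    """
    addr &= 0xFFFF_FFFF_FFFF_FFFF
    upper = addr >> (va_bits - 1)                  # bits 63:(va_bits-1)
    all_ones = (1 << (64 - va_bits + 1)) - 1       # same field, every bit set
    return upper == 0 or upper == all_ones
```

Under the 48-bit assumption, `0x0000_7FFF_FFFF_FFFF` and `0xFFFF_8000_0000_0000` are canonical, while `0x0000_8000_0000_0000` is not.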


///////////////////////////////////////////////////////////////////////////////// 

// WRITE_MEM // General memory write

///////////////////////////////////////////////////////////////////////////////// 


usage:

    WRITE_MEM.x [seg:offset] = temp.x   // where x is one of these:
                                        // {v, z, b, w, d, q} and denotes the
                                        // size of the memory write

definition:

    IF ((seg & 0xFFFC) == NULL)         // GP fault for using a null segment
        EXCEPTION [#GP(0)]              // to reference memory



AMD64 Technology 


24594 — Rev. 3.28—September 2019 


    IF (seg isn't writable)     // GP fault for writing to a read-only segment
        EXCEPTION [#GP(0)]

    IF ((seg==CS) || (seg==DS) || (seg==ES) || (seg==FS) || (seg==GS))
            // CS,DS,ES,FS,GS check for segment limit or canonical
        IF ((!64BIT_MODE) && (offset is outside seg's limit))
            EXCEPTION [#GP(0)]
            // #GP fault for segment limit violation in non-64-bit mode
        IF ((64BIT_MODE) && (offset is non-canonical))
            EXCEPTION [#GP(0)]
            // #GP fault for non-canonical address in 64-bit mode
    ELSIF (seg==SS) // SS checks for segment limit or canonical
        IF ((!64BIT_MODE) && (offset is outside seg's limit))
            EXCEPTION [#SS(0)]
            // stack fault for segment limit violation in non-64-bit mode
        IF ((64BIT_MODE) && (offset is non-canonical))
            EXCEPTION [#SS(0)]
            // stack fault for non-canonical address in 64-bit mode
    ELSE // ((seg==GDT) || (seg==LDT) || (seg==IDT) || (seg==TSS))
            // GDT,LDT,IDT,TSS check for segment limit and canonical
        IF (offset > seg.limit)
            EXCEPTION [#GP(0)]
            // #GP fault for segment limit violation in all modes
        IF ((LONG_MODE) && (offset is non-canonical))
            EXCEPTION [#GP(0)]
            // #GP fault for non-canonical address in long mode

    IF ((ALIGNMENT_CHECK_ENABLED) && (offset is misaligned, considering
                                      its size and alignment))
        EXCEPTION [#AC(0)]

    IF ((64BIT_MODE) && ((seg==CS) || (seg==DS) || (seg==ES) || (seg==SS)))
        temp_linear = offset
    ELSE
        temp_linear = seg.base + offset

    IF ((PAGING_ENABLED) && (the virtual-to-physical translation for
                             temp_linear results in a page-protection violation))
    {
        EXCEPTION [#PF(error_code)]
        // page fault for page-protection violation
        // (U/S violation, Reserved bit violation)
    }

    IF ((PAGING_ENABLED) && (temp_linear is on a not-present page))
        EXCEPTION [#PF(error_code)]     // page fault for not-present page

    memory [temp_linear].x = temp.x     // write the bytes to memory

/////////////////////////////////////////////////////////////////////////////////
// PUSH // Write data to the stack

///////////////////////////////////////////////////////////////////////////////// 

usage: 

PUSH.x temp // where x is one of these: {v, z, b, w, d, q} and 

// denotes the size of the push 

definition: 

WRITE_MEM.x [SS:RSP.s - X] = temp.x // write to the stack

RSP.s = RSP - X // point rsp to the data just written 


///////////////////////////////////////////////////////////////////////////////// 
// POP // Read data from the stack, zero-extend it to 64 bits 
///////////////////////////////////////////////////////////////////////////////// 

usage: 

POP.x temp // where x is one of these: {v, z, b, w, d, q} and 

// denotes the size of the pop 

definition: 

temp = READ_MEM.x [SS:RSP.s] // read from the stack

RSP.s = RSP + X // point rsp above the data just read 
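
The ordering in the PUSH and POP definitions (memory access first, stack-pointer update second) can be illustrated with a small Python model. This is only a sketch: the `Stack` class, its flat `bytearray` memory, and the 64-byte size are assumptions made for the example, and segmentation, paging, and fault checks are omitted.

```python
class Stack:
    """Toy model of a downward-growing, little-endian stack (hypothetical)."""

    def __init__(self, size: int = 64):
        self.mem = bytearray(size)
        self.rsp = size                       # empty stack: RSP at the top

    def push(self, value: int, nbytes: int) -> None:
        data = (value & ((1 << (8 * nbytes)) - 1)).to_bytes(nbytes, "little")
        # WRITE_MEM.x [SS:RSP - X] = temp.x happens before RSP changes,
        # so a faulting write would leave RSP unmodified.
        self.mem[self.rsp - nbytes:self.rsp] = data
        self.rsp -= nbytes                    # RSP = RSP - X

    def pop(self, nbytes: int) -> int:
        # READ_MEM.x [SS:RSP]; the result is a non-negative integer,
        # which corresponds to zero-extension.
        value = int.from_bytes(self.mem[self.rsp:self.rsp + nbytes], "little")
        self.rsp += nbytes                    # RSP = RSP + X
        return value
```

Pushing a 2-byte value and then a 4-byte value lowers `rsp` by 6; popping them in reverse order restores it.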


///////////////////////////////////////////////////////////////////////////////// 
// READ_DESCRIPTOR // Read 8-byte descriptor from GDT/LDT, return the descriptor
///////////////////////////////////////////////////////////////////////////////// 

usage: 

    temp_descriptor = READ_DESCRIPTOR (selector, chktype)

        // chktype field is one of the following:
        // cs_chk     used for far call and far jump
        // clg_chk    used when reading CS for far call or far jump through call gate
        // ss_chk     used when reading SS
        // iret_chk   used when reading CS for IRET or RETF
        // intcs_chk  used when reading the CS for interrupts and exceptions

definition:

    temp_offset = selector AND 0xfff8   // upper 13 bits give an offset
                                        // in the descriptor table

    IF (selector.TI == 0)
        temp_desc = READ_MEM.q [gdt:temp_offset]
            // read 8 bytes from the gdt, split it into
            // (base,limit,attr) if the type bits
            // indicate a block of memory, or split
            // it into (segment,offset,attr) if the type
            // bits indicate a gate, and save the result
            // in temp_desc
    ELSE
        temp_desc = READ_MEM.q [ldt:temp_offset]
            // read 8 bytes from the ldt, split it into
            // (base,limit,attr) if the type bits
            // indicate a block of memory, or split
            // it into (segment,offset,attr) if the type
            // bits indicate a gate, and save the result
            // in temp_desc

    IF (selector.rpl or temp_desc.attr.dpl is illegal for the current mode/cpl)
        EXCEPTION [#GP(selector)]

    IF (temp_desc.attr.type is illegal for the current mode/chktype)
        EXCEPTION [#GP(selector)]

    IF (temp_desc.attr.p==0)
        EXCEPTION [#NP(selector)]

    RETURN (temp_desc)
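
The selector fields used by READ_DESCRIPTOR (`selector.rpl`, `selector.TI`, and the table offset obtained with `selector AND 0xfff8`) can be extracted as in the following Python sketch. The function name `decode_selector` and the dictionary keys are hypothetical; the bit positions follow the standard selector layout (RPL in bits 1:0, TI in bit 2, index in bits 15:3).

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its fields."""
    return {
        "rpl": selector & 0x3,                # requested privilege level
        "ti": (selector >> 2) & 0x1,          # 0 = GDT, 1 = LDT
        "table_offset": selector & 0xFFF8,    # index * 8, byte offset in table
    }
```

For example, selector `0x002B` decodes to RPL 3, TI 0 (GDT), and table offset `0x28`.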

///////////////////////////////////////////////////////////////////////////////// 
// READ_IDT // Read an 8-byte descriptor from the IDT, return the descriptor 
///////////////////////////////////////////////////////////////////////////////// 

usage: 

    temp_idt_desc = READ_IDT (vector)
        // "vector" is the interrupt vector number

definition:

    IF (LONG_MODE)  // long-mode idt descriptors are 16 bytes long
        temp_offset = vector*16
    ELSE // (LEGACY_MODE) legacy-protected-mode idt descriptors are 8 bytes long
        temp_offset = vector*8

    temp_desc = READ_MEM.q [idt:temp_offset]
        // read 8 bytes from the idt, split it into
        // (segment,offset,attr), and save it in temp_desc

    IF (temp_desc.attr.dpl is illegal for the current mode/cpl)
        // exception, with error code that indicates this idt gate
        EXCEPTION [#GP(vector*8+2)]

    IF (temp_desc.attr.type is illegal for the current mode)
        // exception, with error code that indicates this idt gate
        EXCEPTION [#GP(vector*8+2)]

    IF (temp_desc.attr.p==0)
        EXCEPTION [#NP(vector*8+2)]
        // segment-not-present exception, with an error code that
        // indicates this idt gate

    RETURN (temp_desc)


///////////////////////////////////////////////////////////////////////////////// 

// READ_INNER_LEVEL_STACK_POINTER 

// Read a new stack pointer (rsp or ss:esp) from the tss

///////////////////////////////////////////////////////////////////////////////// 

usage: 

    temp_SS_desc:temp_RSP = READ_INNER_LEVEL_STACK_POINTER (new_cpl, ist_index)

definition:

    IF (LONG_MODE)
    {
        IF (ist_index>0)
            // if IST is selected, read an ISTn stack pointer from the tss
            temp_RSP = READ_MEM.q [tss:ist_index*8+28]
        ELSE // (ist_index==0)
            // otherwise read an RSPn stack pointer from the tss
            temp_RSP = READ_MEM.q [tss:new_cpl*8+4]

        temp_SS_desc.sel = NULL + new_cpl
            // in long mode, changing to lower cpl sets SS.sel to
            // NULL+new_cpl
    }
    ELSE // (LEGACY_MODE)
    {
        temp_RSP = READ_MEM.d [tss:new_cpl*8+4]     // read ESPn from the tss
        temp_sel = READ_MEM.d [tss:new_cpl*8+8]     // read SSn from the tss
        temp_SS_desc = READ_DESCRIPTOR (temp_sel, ss_chk)
    }

    return (temp_RSP:temp_SS_desc)


///////////////////////////////////////////////////////////////////////////////// 
// READ_BIT_ARRAY // Read 1 bit from a bit array in memory
/////////////////////////////////////////////////////////////////////////////////

usage:

    temp_value = READ_BIT_ARRAY ([mem], bit_number)

definition:

    temp_BYTE = READ_MEM.b [mem + (bit_number SHR 3)]
                            // read the byte containing the bit
    temp_BIT = temp_BYTE SHR (bit_number & 7)
                            // shift the requested bit position into bit 0
    return (temp_BIT & 0x01)    // return '0' or '1'
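
The READ_BIT_ARRAY definition translates almost line for line into Python. The sketch below assumes memory is modeled as a `bytes` object and that `bit_number` is non-negative; the function and parameter names are illustrative.

```python
def read_bit_array(mem: bytes, base: int, bit_number: int) -> int:
    """Return bit `bit_number` of the bit array starting at mem[base]."""
    temp_byte = mem[base + (bit_number >> 3)]   # byte containing the bit
    temp_bit = temp_byte >> (bit_number & 7)    # move requested bit to bit 0
    return temp_bit & 0x01                      # 0 or 1
```

With `mem = bytes([0b00000001, 0b10000000])`, bit 0 and bit 15 read as 1, and every other bit reads as 0.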
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3 General-Purpose Instruction Reference 


This chapter describes the function, mnemonic syntax, opcodes, affected flags, and possible 
exceptions generated by the general-purpose instructions. General-purpose instructions are used in 
basic software execution. Most of these instructions load, store, or operate on data located in the 
general-purpose registers (GPRs), in memory, or in both. The remaining instructions are used to alter 
the sequential flow of the program by branching to other locations within the program, or to entirely 
different programs. With the exception of the MOVD, MOVMSKPD and MOVMSKPS instructions, 
which operate on MMX/XMM registers, the instructions within the category of general-purpose 
instructions do not operate on any other register set. 

Most general-purpose instructions are supported in all hardware implementations of the AMD64
architecture. However, some instructions in this group are optional, and support must be determined by
testing processor feature flags using the CPUID instruction. These instructions are listed in Table 3-1,
along with the CPUID function, register, and bit used to test for the presence of each instruction.


Table 3-1. Instruction Support Indicated by CPUID Feature Bits

Instruction                               CPUID Function(s)        Register[Bit]  Feature Flag
Bit Manipulation Instructions - group 1   0000_0007h (ECX=0)       EBX[3]         BMI1
Bit Manipulation Instructions - group 2   0000_0007h (ECX=0)       EBX[8]         BMI2
CMPXCHG8B                                 0000_0001h, 8000_0001h   EDX[8]         CMPXCHG8B
CMPXCHG16B                                0000_0001h               ECX[13]        CMPXCHG16B
CMOVcc (Conditional Moves)                0000_0001h, 8000_0001h   EDX[15]        CMOV
CLFLUSH                                   0000_0001h               EDX[19]        CLFSH
CRC32                                     0000_0001h               ECX[20]        SSE42
LZCNT                                     8000_0001h               ECX[5]         ABM
Long Mode and Long Mode instructions      8000_0001h               EDX[29]        LM
MFENCE, LFENCE                            0000_0001h               EDX[26]        SSE2
MOVBE                                     0000_0001h               ECX[22]        MOVBE
MOVD (note 1)                             0000_0001h, 8000_0001h   EDX[23]        MMX
                                          0000_0001h               EDX[26]        SSE2
MOVNTI                                    0000_0001h               EDX[26]        SSE2
POPCNT                                    0000_0001h               ECX[23]        POPCNT
PREFETCH/PREFETCHW (note 2)               8000_0001h               ECX[8]         3DNowPrefetch
                                                                   EDX[29]        LM
                                                                   EDX[31]        3DNow
RDFSBASE, RDGSBASE, WRFSBASE, WRGSBASE    0000_0007h (ECX=0)       EBX[0]         FSGSBASE
RDPRU                                     8000_0008h               EBX[4]         RDPRU
SFENCE                                    0000_0001h               EDX[25]        SSE
Trailing Bit Manipulation Instructions    8000_0001h               ECX[21]        TBM

Notes:

1. The MOVD variant that moves values to or from MMX registers is part of the MMX subset; the MOVD variant that
moves data to or from XMM registers is part of the SSE2 subset.

2. Instruction is supported if any one of the listed feature flags is set.


For more information on using the CPUID instruction, see the reference page for the CPUID 
instruction on page 160. For a comprehensive list of all instruction support feature flags, see 
Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 
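
Testing a feature bit from Table 3-1 amounts to shifting and masking one register value returned by CPUID. The Python sketch below shows only that decoding step. Python cannot execute CPUID directly, so the `cpuid_results` argument (a dictionary mapping a CPUID function number to the register values it returned) is a hypothetical input assumed to be obtained elsewhere, and `FEATURE_BITS` covers only a subset of the table.

```python
# (CPUID function, register, bit) for a few rows of Table 3-1.
FEATURE_BITS = {
    "BMI1":   (0x0000_0007, "EBX", 3),
    "BMI2":   (0x0000_0007, "EBX", 8),
    "CMOV":   (0x0000_0001, "EDX", 15),
    "POPCNT": (0x0000_0001, "ECX", 23),
    "LZCNT":  (0x8000_0001, "ECX", 5),    # reported by the ABM feature flag
    "TBM":    (0x8000_0001, "ECX", 21),
}

def has_feature(name: str, cpuid_results: dict) -> bool:
    """Return True if the named feature bit is set in the given CPUID output."""
    function, register, bit = FEATURE_BITS[name]
    regs = cpuid_results.get(function, {})
    return bool((regs.get(register, 0) >> bit) & 1)
```

A caller would populate `cpuid_results` from whatever CPUID access mechanism is available on the platform and then query individual features by name.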


The general-purpose instructions can be used in legacy mode or 64-bit long mode. Compilation of 
general-purpose programs for execution in 64-bit long mode offers three primary advantages: access 
to the eight extended, 64-bit general-purpose registers (for a register set consisting of GPR0-GPR15), 
access to the 64-bit virtual address space, and access to the RIP-relative addressing mode. 

For further information about the general-purpose instructions and register resources, see: 

• “General-Purpose Programming” in Volume 1. 

• “Summary of Registers and Data Types” on page 38. 

• “Notation” on page 52. 

• “Instruction Prefixes” on page 5. 

• Appendix B, “General-Purpose Instructions in 64-Bit Mode.” In particular, see “General Rules for 
64-Bit Mode” on page 505. 
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AAA ASCII Adjust After Addition 

Adjusts the value in the AL register to an unpacked BCD value. Use the AAA instruction after using 
the ADD instruction to add two unpacked BCD numbers. 

The instruction is coded without explicit operands: 

AAA 

If the value in the lower nibble of AL is greater than 9 or the AF flag is set to 1, the instruction 
increments the AH register, adds 6 to the AL register, and sets the CF and AF flags to 1. Otherwise, it 
does not change the AH register and clears the CF and AF flags to 0. In either case, AAA clears bits 
7:4 of the AL register, leaving the correct decimal digit in bits 3:0. 

This instruction also makes it possible to add ASCII numbers without having to mask off the upper
nibble (3h) of each ASCII digit.
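
The adjustment described above can be modeled in Python as follows. This is a sketch of the behavior as worded in this section (increment AH, add 6 to AL), not a statement about any particular hardware implementation; the function name `aaa` and its tuple return value are illustrative.

```python
def aaa(al: int, ah: int, af: int):
    """Model AAA: return (AL, AH, AF, CF) after the adjustment."""
    if (al & 0x0F) > 9 or af:
        ah = (ah + 1) & 0xFF       # carry into the next decimal digit
        al = (al + 6) & 0xFF       # skip the six non-decimal nibble values
        af = cf = 1
    else:
        af = cf = 0
    al &= 0x0F                     # clear bits 7:4, keep the decimal digit
    return al, ah, af, cf
```

Adding the ASCII digits '8' (38h) and '5' (35h) with ADD gives AL = 6Dh; applying the adjustment with AH = 0 then yields AH = 1 and AL = 3, the unpacked BCD form of 13.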

MXCSR Flags Affected 

Using this instruction in 64-bit mode generates an invalid-opcode exception. 


Mnemonic    Opcode    Description
AAA         37        Create an unpacked BCD number.
                      (Invalid in 64-bit mode.)


Related Instructions 

AAD, AAM, AAS 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                          U                   U    U    M    U    M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                           X          This instruction was executed in 64-bit mode.
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AAD ASCII Adjust Before Division 

Converts two unpacked BCD digits in the AL (least significant) and AH (most significant) registers to 
a single binary value in the AL register. 

The instruction is coded without explicit operands: 

AAD 

The instruction performs the following operation on the contents of AL and AH:

AL = ((10d * AH) + (AL))

After the conversion, AH is cleared to 00h.

In most modern assemblers, the AAD instruction adjusts from base-10 values. However, by coding the
instruction directly in binary, it can adjust from any base specified by the immediate byte value (ib)
suffixed onto the D5h opcode. For example, code D508h for octal, D50Ah for decimal, and D50Ch for
duodecimal (base 12).
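
The conversion, including the effect of the immediate byte, can be sketched in Python. The function name `aad` and the `base` parameter (standing in for the ib value) are illustrative; flag results are not modeled.

```python
def aad(al: int, ah: int, base: int = 10):
    """Model AAD: return (AL, AH) after combining two unpacked digits."""
    al = (ah * base + al) & 0xFF   # AL = base * AH + AL, truncated to 8 bits
    return al, 0                   # AH is cleared to 00h
```

For example, the unpacked BCD digits AH = 2, AL = 5 become AL = 25 (19h) with the default base of 10.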

Using this instruction in 64-bit mode generates an invalid-opcode exception. 


Mnemonic    Opcode    Description
AAD         D5 0A     Adjust two BCD digits in AL and AH.
                      (Invalid in 64-bit mode.)
(None)      D5 ib     Adjust two BCD digits to the immediate byte base.
                      (Invalid in 64-bit mode.)


Related Instructions 

AAA, AAM, AAS 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                          U                   M    M    U    M    U
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                           X          This instruction was executed in 64-bit mode.



AAM ASCII Adjust After Multiply 

Converts the value in the AL register from binary to two unpacked BCD digits in the AH (most 
significant) and AL (least significant) registers. 

The instruction is coded without explicit operands: 

AAM 

The instruction performs the following operation on the contents of AL and AH:

AH = (AL / 10d)

AL = (AL mod 10d)

In most modern assemblers, the AAM instruction adjusts to base-10 values. However, by coding the
instruction directly in binary, it can adjust to any base specified by the immediate byte value (ib)
suffixed onto the D4h opcode. For example, code D408h for octal, D40Ah for decimal, and D40Ch for
duodecimal (base 12).
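
A minimal Python model of this division, with the immediate byte exposed as a `base` parameter, might look like the following. The function name `aam` is illustrative, and Python's `ZeroDivisionError` is used here only as a stand-in for the #DE exception raised when the immediate byte is 0.

```python
def aam(al: int, base: int = 10):
    """Model AAM: return (AH, AL) as quotient and remainder of AL / base."""
    if base == 0:
        raise ZeroDivisionError("#DE: 8-bit immediate value was 0")
    return al // base, al % base
```

For example, AL = 63 (3Fh) becomes AH = 6, AL = 3 with the default base of 10.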

Using this instruction in 64-bit mode generates an invalid-opcode exception. 


Mnemonic    Opcode    Description
AAM         D4 0A     Create a pair of unpacked BCD values in AH and AL.
                      (Invalid in 64-bit mode.)
(None)      D4 ib     Create a pair of unpacked values to the immediate byte base.
                      (Invalid in 64-bit mode.)


Related Instructions 

AAA, AAD, AAS 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                          U                   M    M    U    M    U
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception
Divide by zero, #DE       X     X             X          8-bit immediate value was 0.
Invalid opcode, #UD                           X          This instruction was executed in 64-bit mode.
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AAS ASCII Adjust After Subtraction 

Adjusts the value in the AL register to an unpacked BCD value. Use the AAS instruction after using 
the SUB instruction to subtract two unpacked BCD numbers. 

The instruction is coded without explicit operands: 

AAS 

If the value in the lower nibble of AL is greater than 9 or the AF flag is set to 1, the instruction
decrements the value in AH, subtracts 6 from the AL register, and sets the CF and AF flags to 1.
Otherwise, it clears the CF and AF flags and the AH register is unchanged. In either case, the
instruction clears bits 7:4 of the AL register, leaving the correct decimal digit in bits 3:0.
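
The adjustment can be modeled in Python as shown below. As with any such sketch, it follows the wording of this section rather than a specific hardware implementation, and it assumes the digit test applies to the lower nibble of AL; the function name `aas` is illustrative.

```python
def aas(al: int, ah: int, af: int):
    """Model AAS: return (AL, AH, AF, CF) after the adjustment."""
    if (al & 0x0F) > 9 or af:
        al = (al - 6) & 0xFF       # correct the digit after a decimal borrow
        ah = (ah - 1) & 0xFF       # borrow from the next decimal digit
        af = cf = 1
    else:
        af = cf = 0
    al &= 0x0F                     # clear bits 7:4, keep the decimal digit
    return al, ah, af, cf
```

For example, starting from the unpacked value 13 (AH = 1, AL = 3), SUB AL, 5 leaves AL = FEh with AF set; the adjustment then yields AH = 0 and AL = 8.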

Using this instruction in 64-bit mode generates an invalid-opcode exception. 


Mnemonic    Opcode    Description
AAS         3F        Create an unpacked BCD number from the contents of
                      the AL register.
                      (Invalid in 64-bit mode.)


Related Instructions 

AAA, AAD, AAM 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                          U                   U    U    M    U    M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                           X          This instruction was executed in 64-bit mode.
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ADC Add with Carry 

Adds the carry flag (CF), the value in a register or memory location (first operand), and an immediate 
value or the value in a register or memory location (second operand), and stores the result in the first 
operand location. 

The instruction has two operands: 

ADC dest, src 

The instruction cannot add two memory operands. The CF flag indicates a pending carry from a 
previous addition operation. The instruction sign-extends an immediate value to the length of the 
destination register or memory location. 

This instruction evaluates the result for both signed and unsigned data types and sets the OF and CF 
flags to indicate a carry in a signed or unsigned result, respectively. It sets the SF flag to indicate the 
sign of a signed result. 

Use the ADC instruction after an ADD instruction as part of a multibyte or multiword addition. 
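
The ADD-then-ADC pattern for multiword addition can be sketched in Python, with the carry flag modeled as an ordinary variable. The function name `add_multiword`, the little-endian list-of-limbs representation, and the default 64-bit limb width are assumptions made for the example.

```python
def add_multiword(a_limbs, b_limbs, width: int = 64):
    """Add two equal-length little-endian limb lists.

    Returns (sum_limbs, carry_out). The first iteration corresponds to ADD
    (no incoming carry); each later iteration corresponds to ADC.
    """
    mask = (1 << width) - 1
    carry = 0
    out = []
    for a, b in zip(a_limbs, b_limbs):
        total = a + b + carry          # ADC: dest + src + CF
        out.append(total & mask)       # result truncated to the operand size
        carry = total >> width         # CF for the next, more significant limb
    return out, carry
```

Adding 1 to the two-limb value `[0xFFFF_FFFF_FFFF_FFFF, 0]` propagates a carry into the second limb and gives `[0, 1]` with no carry out.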

The forms of the ADC instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 


Mnemonic                  Opcode      Description
ADC AL, imm8              14 ib       Add imm8 to AL + CF.
ADC AX, imm16             15 iw       Add imm16 to AX + CF.
ADC EAX, imm32            15 id       Add imm32 to EAX + CF.
ADC RAX, imm32            15 id       Add sign-extended imm32 to RAX + CF.
ADC reg/mem8, imm8        80 /2 ib    Add imm8 to reg/mem8 + CF.
ADC reg/mem16, imm16      81 /2 iw    Add imm16 to reg/mem16 + CF.
ADC reg/mem32, imm32      81 /2 id    Add imm32 to reg/mem32 + CF.
ADC reg/mem64, imm32      81 /2 id    Add sign-extended imm32 to reg/mem64 + CF.
ADC reg/mem16, imm8       83 /2 ib    Add sign-extended imm8 to reg/mem16 + CF.
ADC reg/mem32, imm8       83 /2 ib    Add sign-extended imm8 to reg/mem32 + CF.
ADC reg/mem64, imm8       83 /2 ib    Add sign-extended imm8 to reg/mem64 + CF.
ADC reg/mem8, reg8        10 /r       Add reg8 to reg/mem8 + CF.
ADC reg/mem16, reg16      11 /r       Add reg16 to reg/mem16 + CF.
ADC reg/mem32, reg32      11 /r       Add reg32 to reg/mem32 + CF.
ADC reg/mem64, reg64      11 /r       Add reg64 to reg/mem64 + CF.
ADC reg8, reg/mem8        12 /r       Add reg/mem8 to reg8 + CF.
ADC reg16, reg/mem16      13 /r       Add reg/mem16 to reg16 + CF.
ADC reg32, reg/mem32      13 /r       Add reg/mem32 to reg32 + CF.
ADC reg64, reg/mem64      13 /r       Add reg/mem64 to reg64 + CF.


Related Instructions 

ADD, SBB, SUB 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                          M                   M    M    M    M    M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.



ADCX Unsigned ADD with Carry Flag 

Adds the value in a register (first operand) with a register or memory (second operand) and the carry 
flag, and stores the result in the first operand location. 

This instruction sets the CF based on the unsigned addition. This instruction is useful in multi- 
precision addition algorithms. 
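
The carry behavior described above can be modeled in Python with CF as an explicit input and output. This sketch assumes a 64-bit operand size by default; the function name `adcx` and the `width` parameter are illustrative, and no other flags are modeled because the instruction modifies only CF.

```python
def adcx(dest: int, src: int, cf: int, width: int = 64):
    """Model ADCX: return (result, CF) for dest + src + CF, unsigned."""
    total = dest + src + cf
    return total & ((1 << width) - 1), total >> width
```

With `dest` equal to all ones, `src` equal to 0, and an incoming carry of 1, the result wraps to 0 and the outgoing carry is 1.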


Mnemonic                  Opcode         Description
ADCX reg32, reg/mem32     66 0F 38 F6    Unsigned add with carry flag.
ADCX reg64, reg/mem64     66 0F 38 F6    Unsigned add with carry flag.

Related Instructions

ADOX

rFLAGS Affected

ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                                                                  M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are
blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
Invalid opcode, #UD       X     X             X          Instruction not supported by CPUID Fn0000_0007_EBX[ADX] = 0.
                          X                   X          Lock prefix (F0h) preceding opcode.
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ADD Signed or Unsigned Add 

Adds the value in a register or memory location (first operand) and an immediate value or the value in 
a register or memory location (second operand), and stores the result in the first operand location. 

The instruction has two operands: 

ADD dest, src 

The instruction cannot add two memory operands. The instruction sign-extends an immediate value to 
the length of the destination register or memory operand. 

This instruction evaluates the result for both signed and unsigned data types and sets the OF and CF 
flags to indicate a carry in a signed or unsigned result, respectively. It sets the SF flag to indicate the 
sign of a signed result. 
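
The distinction between CF (unsigned carry) and OF (signed overflow) can be made concrete with a small Python model. The function name `add_flags`, the 8-bit default width, and the returned tuple layout are choices made for this sketch; the overflow test uses the common identity that signed overflow occurs when both operands differ in sign from the result.

```python
def add_flags(a: int, b: int, width: int = 8):
    """Model ADD: return (result, CF, OF, SF) for a width-bit addition."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    cf = int(full > mask)                                  # unsigned carry out
    of = int(((a ^ result) & (b ^ result) & sign) != 0)    # signed overflow
    sf = int((result & sign) != 0)                         # sign of the result
    return result, cf, of, sf
```

For 8-bit operands, 7Fh + 01h sets OF but not CF (signed overflow only), FFh + 01h sets CF but not OF (unsigned carry only), and 80h + 80h sets both.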

The forms of the ADD instruction that write to memory support the LOCK prefix. For details about the
LOCK prefix, see “Lock Prefix” on page 11.


Mnemonic                  Opcode      Description
ADD AL, imm8              04 ib       Add imm8 to AL.
ADD AX, imm16             05 iw       Add imm16 to AX.
ADD EAX, imm32            05 id       Add imm32 to EAX.
ADD RAX, imm32            05 id       Add sign-extended imm32 to RAX.
ADD reg/mem8, imm8        80 /0 ib    Add imm8 to reg/mem8.
ADD reg/mem16, imm16      81 /0 iw    Add imm16 to reg/mem16.
ADD reg/mem32, imm32      81 /0 id    Add imm32 to reg/mem32.
ADD reg/mem64, imm32      81 /0 id    Add sign-extended imm32 to reg/mem64.
ADD reg/mem16, imm8       83 /0 ib    Add sign-extended imm8 to reg/mem16.
ADD reg/mem32, imm8       83 /0 ib    Add sign-extended imm8 to reg/mem32.
ADD reg/mem64, imm8       83 /0 ib    Add sign-extended imm8 to reg/mem64.
ADD reg/mem8, reg8        00 /r       Add reg8 to reg/mem8.
ADD reg/mem16, reg16      01 /r       Add reg16 to reg/mem16.
ADD reg/mem32, reg32      01 /r       Add reg32 to reg/mem32.
ADD reg/mem64, reg64      01 /r       Add reg64 to reg/mem64.
ADD reg8, reg/mem8        02 /r       Add reg/mem8 to reg8.
ADD reg16, reg/mem16      03 /r       Add reg/mem16 to reg16.
ADD reg32, reg/mem32      03 /r       Add reg/mem32 to reg32.
ADD reg64, reg/mem64      03 /r       Add reg/mem64 to reg64.
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Related Instructions 

ADC, SBB, SUB 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                          M                   M    M    M    M    M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
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ADOX Unsigned ADD with Overflow Flag 

Adds the value in a register (first operand) with a register or memory (second operand) and the 
overflow flag, and stores the result in the first operand location. 

This instruction sets the OF based on the unsigned addition and whether there is a carry out. This 
instruction is useful in multi-precision addition algorithms. 
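
The multi-precision use mentioned above can be sketched in Python as follows. The names adox and add_multiprecision, the little-endian limb lists, and the default 64-bit width are assumptions made for illustration; the sketch models only the arithmetic (destination plus source plus OF, with the carry out written back to OF) and none of the other processor state.

```python
def adox(dest: int, src: int, of: int, width: int = 64):
    """Model ADOX: dest + src + OF, returning (result, new OF).

    Hypothetical helper; the new OF is the unsigned carry out of the addition.
    """
    mask = (1 << width) - 1
    total = (dest & mask) + (src & mask) + (of & 1)
    return total & mask, total >> width


def add_multiprecision(a_limbs, b_limbs, width: int = 64):
    """Add two equal-length little-endian limb lists, chaining the carry through OF."""
    of = 0                            # assume OF starts out cleared
    out = []
    for a, b in zip(a_limbs, b_limbs):
        limb, of = adox(a, b, of, width)
        out.append(limb)
    return out, of                    # final OF is the carry out of the top limb
```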


Mnemonic                  Opcode         Description
ADOX reg32, reg/mem32     F3 0F 38 F6    Unsigned add with overflow flag.
ADOX reg64, reg/mem64     F3 0F 38 F6    Unsigned add with overflow flag.

Related Instructions 

ADOX 

rFLAGS Affected 

Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            M
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are 
blank. Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          The destination operand was in a non-writable segment.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
Invalid opcode, #UD       X     X             X          Instruction not supported by CPUID Fn0000_0007_EBX[ADX] = 0.
Invalid opcode, #UD       X                   X          Lock prefix (F0h) preceding opcode.


AND Logical AND 

Performs a bit-wise logical and operation on the value in a register or memory location (first operand) 
and an immediate value or the value in a register or memory location (second operand), and stores the 
result in the first operand location. Both operands cannot be memory locations. 

The instruction has two operands: 

AND dest, src 

The instruction sets each bit of the result to 1 if the corresponding bit of both operands is set; 
otherwise, it clears the bit to 0. The following table shows the truth table for the logical and operation: 


X    Y    X and Y
0    0    0
0    1    0
1    0    0
1    1    1


The forms of the AND instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 


Mnemonic                Opcode      Description
AND AL, imm8            24 ib       and the contents of AL with an immediate 8-bit value and store the result in AL.
AND AX, imm16           25 iw       and the contents of AX with an immediate 16-bit value and store the result in AX.
AND EAX, imm32          25 id       and the contents of EAX with an immediate 32-bit value and store the result in EAX.
AND RAX, imm32          25 id       and the contents of RAX with a sign-extended immediate 32-bit value and store the result in RAX.
AND reg/mem8, imm8      80 /4 ib    and the contents of reg/mem8 with imm8.
AND reg/mem16, imm16    81 /4 iw    and the contents of reg/mem16 with imm16.
AND reg/mem32, imm32    81 /4 id    and the contents of reg/mem32 with imm32.
AND reg/mem64, imm32    81 /4 id    and the contents of reg/mem64 with sign-extended imm32.
AND reg/mem16, imm8     83 /4 ib    and the contents of reg/mem16 with a sign-extended 8-bit value.
AND reg/mem32, imm8     83 /4 ib    and the contents of reg/mem32 with a sign-extended 8-bit value.
AND reg/mem64, imm8     83 /4 ib    and the contents of reg/mem64 with a sign-extended 8-bit value.
AND reg/mem8, reg8      20 /r       and the contents of an 8-bit register or memory location with the contents of an 8-bit register.
AND reg/mem16, reg16    21 /r       and the contents of a 16-bit register or memory location with the contents of a 16-bit register.
AND reg/mem32, reg32    21 /r       and the contents of a 32-bit register or memory location with the contents of a 32-bit register.
AND reg/mem64, reg64    21 /r       and the contents of a 64-bit register or memory location with the contents of a 64-bit register.
AND reg8, reg/mem8      22 /r       and the contents of an 8-bit register with the contents of an 8-bit memory location or register.
AND reg16, reg/mem16    23 /r       and the contents of a 16-bit register with the contents of a 16-bit memory location or register.
AND reg32, reg/mem32    23 /r       and the contents of a 32-bit register with the contents of a 32-bit memory location or register.
AND reg64, reg/mem64    23 /r       and the contents of a 64-bit register with the contents of a 64-bit memory location or register.


Related Instructions 

TEST, OR, NOT, NEG, XOR 

rFLAGS Affected 


Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   M    M    U    M    0
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 
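
The result and flag behavior of AND can be modeled with a short Python sketch. The helper name and_with_flags is hypothetical, and the undefined AF flag is represented as None purely as a convention of this sketch.

```python
def and_with_flags(a: int, b: int, width: int = 8):
    """Model AND dest, src for one operand width; returns (result, flags).

    Hypothetical helper for illustration only.
    """
    mask = (1 << width) - 1
    result = a & b & mask             # each result bit is 1 only if both inputs are 1
    flags = {
        "OF": 0,                                             # always cleared
        "CF": 0,                                             # always cleared
        "SF": (result >> (width - 1)) & 1,                   # sign of the result
        "ZF": int(result == 0),                              # result is zero
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),   # even parity, low byte
        "AF": None,                                          # undefined after AND
    }
    return result, flags
```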


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          The destination operand was in a non-writable segment.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.


ANDN Logical And-Not 

Performs a bit-wise logical and of the second source operand and the one's complement of the first 
source operand and stores the result into the destination operand. 

This instruction has three operands: 

ANDN dest, src1, src2 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand 
size is 64-bit; if VEX.W is 0, the operand size is 32-bit. In 32-bit mode, VEX.W is ignored. 16-bit 
operands are not supported. 

The destination operand (dest) is always a general purpose register. 

The first source operand (src1) is a general purpose register and the second source operand (src2) is 
either a general purpose register or a memory operand. 

This instruction implements the following operation: 

not tmp, src1 
and dest, tmp, src2 

The flags are set according to the result of the and pseudo-operation. 
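
The two pseudo-operations above can be written as a minimal Python sketch. The function name andn and the default 32-bit width are assumptions for illustration; the flags are not modeled.

```python
def andn(src1: int, src2: int, width: int = 32) -> int:
    """Model ANDN dest, src1, src2: (one's complement of src1) and src2."""
    mask = (1 << width) - 1
    tmp = ~src1 & mask                # not tmp, src1 (truncated to the operand size)
    return tmp & src2 & mask          # and dest, tmp, src2
```

A typical use is clearing, in src2, every bit that is set in src1: andn(0x0000FFFF, 0x12345678) keeps only the upper half and returns 0x12340000.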

The ANDN instruction is a BMI1 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI1] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding
                                VEX   RXB.map_select   W.vvvv.L.pp    Opcode
ANDN reg32, reg32, reg/mem32    C4    RXB.02           0.src1.0.00    F2 /r
ANDN reg64, reg64, reg/mem64    C4    RXB.02           1.src1.0.00    F2 /r


Related Instructions 

BEXTR, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, BSF, BSR, 
LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 

rFLAGS Affected 

Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   M    M    U    U    0
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        BMI instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          BMI instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI] = 0.
Invalid opcode, #UD                           X          VEX.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


24594 — Rev. 3.28—September 2019 



AMD64 Technology 


BEXTR Bit Field Extract 

(register form) 

Extracts a contiguous field of bits from the first source operand, as specified by the control field setting 
in the second source operand and puts the extracted field into the least significant bit positions of the 
destination. The remaining bits in the destination register are cleared to 0. 

This instruction has three operands: 

BEXTR dest, src, cntl 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand 
size is 64-bit; if VEX.W is 0, the operand size is 32-bit. In 32-bit mode, VEX.W is ignored. 16-bit 
operands are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is either a general purpose register or a memory operand. 

The control (cntl) operand is a general purpose register that provides two fields describing the range of 
bits to extract: 

• lsb_index (in bits 7:0): specifies the index of the least significant bit of the field 

• length (in bits 15:8): specifies the number of bits in the field. 

The position of the extracted field can be expressed as: 

[lsb_index + length - 1] : [lsb_index] 

For example, if the lsb_index is 7 and length is 5, then bits 11:7 of the source will be copied to bits 4:0 
of the destination, with the rest of the destination being zero-filled. Zeros are provided for any bit 
positions in the specified range that lie beyond the most significant bit of the source operand. A length 
value of zero results in all zeros being written to the destination. 
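
The extraction rule can be sketched in Python as follows. The function name bextr and the default 32-bit width are illustrative assumptions; the control value is decoded from bits 7:0 and 15:8 as described above, and the flags are not modeled.

```python
def bextr(src: int, cntl: int, width: int = 32) -> int:
    """Model BEXTR dest, src, cntl for one operand width."""
    mask = (1 << width) - 1
    lsb_index = cntl & 0xFF           # control bits 7:0
    length = (cntl >> 8) & 0xFF       # control bits 15:8
    # Shifting the truncated source right supplies zeros for any bit
    # positions beyond the most significant bit of the operand.
    field = (src & mask) >> lsb_index
    return field & ((1 << length) - 1)   # length 0 yields an all-zero result
```

With lsb_index 7 and length 5 the control value is (5 << 8) | 7, so bextr(0x00000F80, (5 << 8) | 7) copies bits 11:7 into bits 4:0 and returns 0x1F.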

This form of the BEXTR instruction is a BMI1 instruction. Support for this instruction is indicated by 
CPUID Fn0000_0007_EBX_x0[BMI1] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding
                                VEX   RXB.map_select   W.vvvv.L.pp    Opcode
BEXTR reg32, reg/mem32, reg32   C4    RXB.02           0.cntl.0.00    F7 /r
BEXTR reg64, reg/mem64, reg64   C4    RXB.02           1.cntl.0.00    F7 /r

Related Instructions 

ANDN, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, BSF, BSR, 
LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 

rFLAGS Affected 


Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   U    M    U    U    0
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        BMI instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          BMI instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI] = 0.
Invalid opcode, #UD                           X          VEX.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


BEXTR Bit Field Extract 

(immediate form) 

Extracts a contiguous field of bits from the first source operand, as specified by the control field setting 
in the second source operand and puts the extracted field into the least significant bit positions of the 
destination. The remaining bits in the destination register are cleared to 0. 

This instruction has three operands: 

BEXTR dest, src, cntl 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is either a general purpose register or a memory operand. 

The control (cntl) operand is a 32-bit immediate value that provides two fields describing the range of 
bits to extract: 

• lsb_index (in immediate operand bits 7:0): specifies the index of the least significant bit of the 
field 

• length (in immediate operand bits 15:8): specifies the number of bits in the field. 

The position of the extracted field can be expressed as: 

[lsb_index + length - 1] : [lsb_index] 

For example, if the lsb_index is 7 and length is 5, then bits 11:7 of the source will be copied to bits 4:0 
of the destination, with the rest of the destination being zero-filled. Zeros are provided for any bit 
positions in the specified range that lie beyond the most significant bit of the source operand. A length 
value of zero results in all zeros being written to the destination. 

This form of the BEXTR instruction is a TBM instruction. Support for this instruction is indicated by 
CPUID Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding
                                XOP   RXB.map_select   W.vvvv.L.pp    Opcode
BEXTR reg32, reg/mem32, imm32   8F    RXB.0A           0.1111.0.00    10 /r /id
BEXTR reg64, reg/mem64, imm32   8F    RXB.0A           1.1111.0.00    10 /r /id

Related Instructions 

ANDN, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, BSF, BSR, 
LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 

rFLAGS Affected 


Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   U    M    U    U    0
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        TBM instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0.
Invalid opcode, #UD                           X          XOP.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


BLCFILL Fill From Lowest Clear Bit 

Finds the least significant zero bit in the source operand, clears all bits below that bit to 0 and writes 
the result to the destination. If there is no zero bit in the source operand, the destination is written with 
all zeros. 

This instruction has two operands: 

BLCFILL dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The BLCFILL instruction effectively performs a bit-wise logical and of the source operand and the 
result of incrementing the source operand by 1 and stores the result to the destination register: 

add tmp, src, 1 
and dest, tmp, src 

The value of the carry flag of rFLAGS is generated according to the result of the add pseudo-instruction 
and the remaining arithmetic flags are generated by the and pseudo-instruction. 
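
A minimal Python sketch of the BLCFILL pseudo-instructions follows. The function name blcfill and the default 32-bit width are assumptions for illustration; only the result and the carry flag from the add step are modeled.

```python
def blcfill(src: int, width: int = 32):
    """Model BLCFILL dest, src; returns (dest, CF)."""
    mask = (1 << width) - 1
    src &= mask
    total = src + 1                   # add tmp, src, 1
    tmp = total & mask
    cf = total >> width               # carry out of the add (1 only if src is all ones)
    return tmp & src, cf              # and dest, tmp, src
```

For the input 0x57 (binary 0101 0111) the lowest clear bit is bit 3, so the three bits below it are cleared and the result is 0x50.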

The BLCFILL instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding
                                XOP   RXB.map_select   W.vvvv.L.pp    Opcode
BLCFILL reg32, reg/mem32        8F    RXB.09           0.dest.0.00    01 /1
BLCFILL reg64, reg/mem64        8F    RXB.09           1.dest.0.00    01 /1


Related Instructions 

ANDN, BEXTR, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, BSF, 
BSR, LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


rFLAGS Affected 

Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   M    M    U    U    M
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        TBM instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0.
Invalid opcode, #UD                           X          XOP.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


BLCI Isolate Lowest Clear Bit 

Finds the least significant zero bit in the source operand, sets all other bits to 1 and writes the result to 
the destination. If there is no zero bit in the source operand, the destination is written with all ones. 

This instruction has two operands: 

BLCI dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The BLCI instruction effectively performs a bit-wise logical or of the source operand and the inverse 
of the result of incrementing the source operand by 1, and stores the result to the destination register: 

add tmp, src, 1 
not tmp, tmp 
or dest, tmp, src 

The value of the carry flag of rFLAGS is generated according to the result of the add pseudo-instruction 
and the remaining arithmetic flags are generated by the or pseudo-instruction. 
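
A minimal Python sketch of the BLCI pseudo-instructions follows. The function name blci and the default 32-bit width are assumptions for illustration; only the result and the carry flag from the add step are modeled.

```python
def blci(src: int, width: int = 32):
    """Model BLCI dest, src; returns (dest, CF)."""
    mask = (1 << width) - 1
    src &= mask
    total = src + 1                   # add tmp, src, 1
    cf = total >> width               # carry out of the add (1 only if src is all ones)
    tmp = ~total & mask               # not tmp, tmp
    return (tmp | src) & mask, cf     # or dest, tmp, src
```

For the input 0x57 the lowest clear bit is bit 3, so the result is 0xFFFFFFF7: every bit is 1 except bit 3.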

The BLCI instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding
                                XOP   RXB.map_select   W.vvvv.L.pp    Opcode
BLCI reg32, reg/mem32           8F    RXB.09           0.dest.0.00    02 /6
BLCI reg64, reg/mem64           8F    RXB.09           1.dest.0.00    02 /6


Related Instructions 

ANDN, BEXTR, BLCFILL, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, 
BSF, BSR, LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


rFLAGS Affected 

Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   M    M    U    U    M
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        TBM instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0.
Invalid opcode, #UD                           X          XOP.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


BLCIC Isolate Lowest Clear Bit and Complement 

Finds the least significant zero bit in the source operand, sets that bit to 1, clears all other bits to 0 and 
writes the result to the destination. If there is no zero bit in the source operand, the destination is 
written with all zeros. 

This instruction has two operands: 

BLCIC dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The BLCIC instruction effectively performs a bit-wise logical and of the negation of the source 
operand and the result of incrementing the source operand by 1, and stores the result to the destination 
register: 

add tmp1, src, 1 
not tmp2, src 
and dest, tmp1, tmp2 

The value of the carry flag of rFLAGS is generated according to the result of the add pseudo-instruction 
and the remaining arithmetic flags are generated by the and pseudo-instruction. 
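
A minimal Python sketch of the BLCIC pseudo-instructions follows. The function name blcic and the default 32-bit width are assumptions for illustration; only the result and the carry flag from the add step are modeled.

```python
def blcic(src: int, width: int = 32):
    """Model BLCIC dest, src; returns (dest, CF)."""
    mask = (1 << width) - 1
    src &= mask
    total = src + 1                   # add: source incremented by 1
    cf = total >> width               # carry out of the add (1 only if src is all ones)
    incremented = total & mask
    inverted = ~src & mask            # not: one's complement of the source
    return incremented & inverted, cf # and of the two temporaries
```

For the input 0x57 the lowest clear bit is bit 3, so the result is 0x08: that bit alone is set.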

The BLCIC instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding
                                XOP   RXB.map_select   W.vvvv.L.pp    Opcode
BLCIC reg32, reg/mem32          8F    RXB.09           0.dest.0.00    01 /5
BLCIC reg64, reg/mem64          8F    RXB.09           1.dest.0.00    01 /5


Related Instructions 

ANDN, BEXTR, BLCFILL, BLCI, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, 
BSF, BSR, LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


rFLAGS Affected 

Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   M    M    U    U    M
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        TBM instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0.
Invalid opcode, #UD                           X          XOP.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


BLCMSK Mask From Lowest Clear Bit 

Finds the least significant zero bit in the source operand, sets that bit to 1, clears all bits above that bit 
to 0 and writes the result to the destination. If there is no zero bit in the source operand, the destination 
is written with all ones. 

This instruction has two operands: 

BLCMSK dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The BLCMSK instruction effectively performs a bit-wise logical xor of the source operand and the 
result of incrementing the source operand by 1 and stores the result to the destination register: 

add tmp1, src, 1 
xor dest, tmp1, src 

The value of the carry flag of rFLAGS is generated according to the result of the add pseudo-instruction 
and the remaining arithmetic flags are generated by the xor pseudo-instruction. 

If the input is all ones, the output is a value with all bits set to 1. 
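
A minimal Python sketch of the BLCMSK pseudo-instructions follows. The function name blcmsk and the default 32-bit width are assumptions for illustration; only the result and the carry flag from the add step are modeled.

```python
def blcmsk(src: int, width: int = 32):
    """Model BLCMSK dest, src; returns (dest, CF)."""
    mask = (1 << width) - 1
    src &= mask
    total = src + 1                   # add: source incremented by 1
    cf = total >> width               # carry out of the add (1 only if src is all ones)
    incremented = total & mask
    return incremented ^ src, cf      # xor of the incremented value and the source
```

For the input 0x57 the lowest clear bit is bit 3, so the result is 0x0F: a mask covering bit 3 and every bit below it.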

The BLCMSK instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Instruction Encoding 

Mnemonic                        Encoding
                                XOP   RXB.map_select   W.vvvv.L.pp    Opcode
BLCMSK reg32, reg/mem32         8F    RXB.09           0.dest.0.00    02 /1
BLCMSK reg64, reg/mem64         8F    RXB.09           1.dest.0.00    02 /1

Related Instructions 





ANDN, BEXTR, BLCFILL, BLCI, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, BSF, BSR, 
LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


rFLAGS Affected 

Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   M    M    U    U    M
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        TBM instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0.
Invalid opcode, #UD                           X          XOP.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


BLCS Set Lowest Clear Bit 

Finds the least significant zero bit in the source operand, sets that bit to 1 and writes the result to the 
destination. If there is no zero bit in the source operand, the source is copied to the destination (and CF 
in rFLAGS is set to 1). 

This instruction has two operands: 

BLCS dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The BLCS instruction effectively performs a bit-wise logical or of the source operand and the result 
of incrementing the source operand by 1, and stores the result to the destination register: 

add tmp, src, 1 
or dest, tmp, src 

The value of the carry flag of rFLAGS is generated by the add pseudo-instruction and the remaining 
arithmetic flags are generated by the or pseudo-instruction. 
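
A minimal Python sketch of the BLCS pseudo-instructions follows. The function name blcs and the default 32-bit width are assumptions for illustration; only the result and the carry flag from the add step are modeled.

```python
def blcs(src: int, width: int = 32):
    """Model BLCS dest, src; returns (dest, CF)."""
    mask = (1 << width) - 1
    src &= mask
    total = src + 1                   # add tmp, src, 1
    cf = total >> width               # carry out of the add (1 only if src is all ones)
    tmp = total & mask
    return tmp | src, cf              # or dest, tmp, src
```

For the input 0x57 the lowest clear bit is bit 3, so the result is 0x5F. An all-ones source is returned unchanged with CF set to 1, matching the description above.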

The BLCS instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding
                                XOP   RXB.map_select   W.vvvv.L.pp    Opcode
BLCS reg32, reg/mem32           8F    RXB.09           0.dest.0.00    01 /3
BLCS reg64, reg/mem64           8F    RXB.09           1.dest.0.00    01 /3

Related Instructions 


ANDN, BEXTR, BLCFILL, BLCI, BLCIC, BLCMSK, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, 
BSF, BSR, LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 

rFLAGS Affected 

Flag   ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value                                            0                   M    M    U    U    M
Bit    21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        TBM instructions are only recognized in protected mode.
Invalid opcode, #UD                           X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0.
Invalid opcode, #UD                           X          XOP.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                       X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


BLSFILL Fill From Lowest Set Bit 

Finds the least significant one bit in the source operand, sets all bits below that bit to 1 and writes the 
result to the destination. If there is no one bit in the source operand, the destination is written with all 
ones. 

This instruction has two operands: 

BLSFILL dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The BLSFILL instruction effectively performs a bit-wise logical or of the source operand and the 
result of subtracting 1 from the source operand, and stores the result to the destination register: 

sub tmp, src, 1 
or dest, tmp, src 

The value of the carry flag of rFLAGS is generated by the sub pseudo-instruction and the remaining 
arithmetic flags are generated by the or pseudo-instruction. 
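
The following C sketch models the sub/or pair above for the 32-bit operand size. It is illustrative only: blsfill_result and blsfill32 are hypothetical names, and the carry flag is modeled as the borrow of the subtraction, which occurs only when the source is zero.

```c
#include <stdint.h>

/* Hypothetical container: destination value plus the modeled carry flag. */
typedef struct {
    uint32_t dest;
    int cf;
} blsfill_result;

/* Mirrors: sub tmp, src, 1 ; or dest, tmp, src (32-bit operand size). */
blsfill_result blsfill32(uint32_t src)
{
    blsfill_result r;
    uint32_t tmp = src - 1u;     /* sub: wraps to all ones when src is 0 */
    r.cf = (src == 0u);          /* borrow out of the sub (assumed CF model) */
    r.dest = tmp | src;          /* or: fills every bit below the lowest set bit */
    return r;
}
```

A source of 0000_0050h gives 0000_005Fh; a source of zero gives all ones, matching the description at the top of this page.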

The BLSFILL instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                    Encoding
                            XOP   RXB.map_select   W.vvvv.L.pp   Opcode
BLSFILL reg32, reg/mem32    8F    RXB.09           0.dest.0.00   01 /2
BLSFILL reg64, reg/mem64    8F    RXB.09           1.dest.0.00   01 /2

Related Instructions


ANDN, BEXTR, BLCFILL, BLCI, BLCIC, BLCMSK, BLCS, BLSI, BLSIC, BLSR, BLSMSK, BSF, 
BSR, LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       0               M   M   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        TBM instructions are only recognized in protected mode.
                                             X          TBM instructions are not supported, as indicated by
                                                        CPUID Fn8000_0001_ECX[TBM] = 0.
                                             X          XOP.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BLSI Isolate Lowest Set Bit 

Clears all bits in the source operand except for the least significant bit that is set to 1 and writes the 
result to the destination. If the source is all zeros, the destination is written with all zeros. 

This instruction has two operands: 

BLSI dest, src 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand 
size is 64-bit; if VEX.W is 0, the operand size is 32-bit. In 32-bit mode, VEX.W is ignored. 16-bit 
operands are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is either a general purpose register or a memory operand. 

This instruction implements the following operation: 

neg tmp, src 
and dest, tmp, src 

The value of the carry flag is generated by the neg pseudo-instruction and the remaining status flags 
are generated by the and pseudo-instruction. 
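
A minimal C sketch of the neg/and pair above, for the 32-bit operand size, is shown here for illustration. The names blsi_result and blsi32 are hypothetical, and the carry flag is modeled the way a neg behaves, that is, set whenever the source is nonzero.

```c
#include <stdint.h>

/* Hypothetical container: destination value plus the modeled carry flag. */
typedef struct {
    uint32_t dest;
    int cf;
} blsi_result;

/* Mirrors the neg/and pair: dest = src AND (two's complement of src). */
blsi_result blsi32(uint32_t src)
{
    blsi_result r;
    uint32_t tmp = 0u - src;     /* neg: two's complement of the source */
    r.cf = (src != 0u);          /* neg sets CF for any nonzero source (assumed CF model) */
    r.dest = tmp & src;          /* and: only the lowest set bit survives */
    return r;
}
```

A source of 0000_0058h isolates bit 3 and returns 0000_0008h; a source of zero returns zero.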

The BLSI instruction is a BMI1 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI1] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 
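
As a sketch of how software might test these feature flags once the CPUID register values have been read, the C helpers below check a single bit in a saved register image. The bit positions (BMI1 in EBX bit 3 of Fn0000_0007 with ECX = 0, TBM in ECX bit 21 of Fn8000_0001) are assumptions taken from commonly published CPUID feature tables rather than from this page; confirm them against Appendix D. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit positions; verify against the CPUID feature-flag tables. */
#define BMI1_BIT_IN_FN7_EBX        3u
#define TBM_BIT_IN_FN8000_0001_ECX 21u

/* fn7_ebx: EBX value returned by CPUID Fn0000_0007 with ECX = 0. */
int has_bmi1(uint32_t fn7_ebx)
{
    return (int)((fn7_ebx >> BMI1_BIT_IN_FN7_EBX) & 1u);
}

/* fn8000_0001_ecx: ECX value returned by CPUID Fn8000_0001. */
int has_tbm(uint32_t fn8000_0001_ecx)
{
    return (int)((fn8000_0001_ecx >> TBM_BIT_IN_FN8000_0001_ECX) & 1u);
}
```

Keeping the bit test separate from the CPUID read makes the check independent of any particular compiler intrinsic or inline-assembly syntax.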


Mnemonic                    Encoding
                            VEX   RXB.map_select   W.vvvv.L.pp   Opcode
BLSI reg32, reg/mem32       C4    RXB.02           0.dest.0.00   F3 /3
BLSI reg64, reg/mem64       C4    RXB.02           1.dest.0.00   F3 /3


Related Instructions 

ANDN, BEXTR, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSIC, BLSR, BLSMSK, BSF, BSR, 
LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       0               M   M   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        BMI instructions are only recognized in protected mode.
                                             X          BMI instructions are not supported, as indicated by
                                                        CPUID Fn0000_0007_EBX_x0[BMI] = 0.
                                             X          VEX.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BLSIC Isolate Lowest Set Bit and Complement 

Finds the least significant bit that is set to 1 in the source operand, clears that bit to 0, sets all other bits 
to 1 and writes the result to the destination. If there is no one bit in the source operand, the destination 
is written with all ones. 

This instruction has two operands: 

BLSIC dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size 
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands 
are not supported. 

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The BLSIC instruction effectively performs a bit-wise logical or of the inverse of the source operand 
and the result of subtracting 1 from the source operand, and stores the result to the destination register: 

sub tmp1, src, 1 
not tmp2, src 
or dest, tmp1, tmp2 

The value of the carry flag of rFLAGS is generated by the sub pseudo-instruction and the remaining 
arithmetic flags are generated by the or pseudo-instruction. 
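
For illustration, the sub/not/or sequence above can be modeled in C for the 32-bit operand size. This is a sketch under the assumption that the carry flag equals the borrow of the subtraction; blsic_result and blsic32 are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical container: destination value plus the modeled carry flag. */
typedef struct {
    uint32_t dest;
    int cf;
} blsic_result;

/* Mirrors: sub tmp1, src, 1 ; not tmp2, src ; or dest, tmp1, tmp2. */
blsic_result blsic32(uint32_t src)
{
    blsic_result r;
    uint32_t tmp1 = src - 1u;    /* sub: clears lowest set bit, sets the bits below it */
    uint32_t tmp2 = ~src;        /* not: inverse of the source */
    r.cf = (src == 0u);          /* borrow out of the sub (assumed CF model) */
    r.dest = tmp1 | tmp2;        /* or: every bit is 1 except the lowest set bit of src */
    return r;
}
```

A source of 0000_0058h, whose lowest set bit is bit 3, produces FFFF_FFF7h.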

The BLSIC instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                    Encoding
                            XOP   RXB.map_select   W.vvvv.L.pp   Opcode
BLSIC reg32, reg/mem32      8F    RXB.09           0.dest.0.00   01 /6
BLSIC reg64, reg/mem64      8F    RXB.09           1.dest.0.00   01 /6


Related Instructions 

ANDN, BEXTR, BLCFILL, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSR, BLSMSK, 
BSF, BSR, LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       0               M   M   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        TBM instructions are only recognized in protected mode.
                                             X          TBM instructions are not supported, as indicated by
                                                        CPUID Fn8000_0001_ECX[TBM] = 0.
                                             X          XOP.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BLSMSK Mask From Lowest Set Bit 

Forms a mask with bits set to 1 from bit 0 up to and including the least significant bit position that is set 
to 1 in the source operand and writes the mask to the destination. If the value of the source operand is 
zero, the destination is written with all ones. 

This instruction has two operands: 

BLSMSK dest, src 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand 
size is 64-bit; if VEX.W is 0, the operand size is 32-bit. In 32-bit mode, VEX.W is ignored. 16-bit 
operands are not supported. 

The destination (dest) is always a general purpose register. 

The source operand (src) is either a general purpose register or a memory operand. 

This instruction implements the operation: 

sub tmp, src, 1 
xor dest, tmp, src 

The value of the carry flag is generated by the sub pseudo-instruction and the remaining status flags 
are generated by the xor pseudo-instruction. 

If the input is zero, the output is a value with all bits set to 1. If this is considered a corner case input, 
software may test the carry flag to detect the zero input value. 
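
The sub/xor pair and the zero-input corner case can be sketched in C for the 32-bit operand size as follows. The names blsmsk_result and blsmsk32 are hypothetical, and the carry flag is modeled as the borrow of the subtraction, so it is 1 exactly when the source is zero.

```c
#include <stdint.h>

/* Hypothetical container: destination mask plus the modeled carry flag. */
typedef struct {
    uint32_t dest;
    int cf;
} blsmsk_result;

/* Mirrors the sub/xor pair: dest = (src - 1) XOR src. */
blsmsk_result blsmsk32(uint32_t src)
{
    blsmsk_result r;
    uint32_t tmp = src - 1u;     /* sub: wraps to all ones when src is 0 */
    r.cf = (src == 0u);          /* borrow flags the zero-input corner case */
    r.dest = tmp ^ src;          /* xor: ones from bit 0 through the lowest set bit */
    return r;
}
```

A caller that treats zero as a special case can check the cf member instead of comparing the source with zero a second time.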

The BLSMSK instruction is a BMI1 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI1] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                    Encoding
                            VEX   RXB.map_select   W.vvvv.L.pp   Opcode
BLSMSK reg32, reg/mem32     C4    RXB.02           0.dest.0.00   F3 /2
BLSMSK reg64, reg/mem64     C4    RXB.02           1.dest.0.00   F3 /2


Related Instructions 

ANDN, BEXTR, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BSF, BSR, 
LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       0               M   M   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        BMI instructions are only recognized in protected mode.
                                             X          BMI instructions are not supported, as indicated by
                                                        CPUID Fn0000_0007_EBX_x0[BMI] = 0.
                                             X          VEX.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BLSR Reset Lowest Set Bit 

Clears the least-significant bit that is set to 1 in the input operand and writes the modified operand to 
the destination. 

This instruction has two operands: 

BLSR dest, src 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand 
size is 64-bit; if VEX.W is 0, the operand size is 32-bit. In 32-bit mode, VEX.W is ignored. 16-bit 
operands are not supported. 

The destination (dest) is always a general purpose register. 

The source operand (src) is either a general purpose register or a memory operand. 

This instruction implements the operation: 

sub tmp, src, 1 
and dest, tmp, src 

The value of the carry flag is generated by the sub pseudo-instruction and the remaining status flags 
are generated by the and pseudo-instruction. 
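
A short C sketch of the sub/and pair follows, together with one common use: counting set bits by clearing the lowest one repeatedly. The names blsr32 and count_set_bits are hypothetical, and the loop is an illustration of the idiom, not something this page prescribes.

```c
#include <stdint.h>

/* Mirrors the sub/and pair: dest = (src - 1) AND src. */
uint32_t blsr32(uint32_t src)
{
    return (src - 1u) & src;     /* clears the lowest set bit; 0 stays 0 */
}

/* Counts set bits by resetting the lowest set bit until none remain. */
unsigned count_set_bits(uint32_t value)
{
    unsigned count = 0;
    while (value != 0u) {
        value = blsr32(value);   /* one iteration per set bit */
        count++;
    }
    return count;
}
```

The loop runs once per set bit, so sparse values finish in few iterations.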

The BLSR instruction is a BMI1 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI1] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                    Encoding
                            VEX   RXB.map_select   W.vvvv.L.pp   Opcode
BLSR reg32, reg/mem32       C4    RXB.02           0.dest.0.00   F3 /1
BLSR reg64, reg/mem64       C4    RXB.02           1.dest.0.00   F3 /1


Related Instructions 

ANDN, BEXTR, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSMSK, BSF, BSR, 
LZCNT, POPCNT, T1MSKC, TZCNT, TZMSK 


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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       0               M   M   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        BMI instructions are only recognized in protected mode.
                                             X          BMI instructions are not supported, as indicated by
                                                        CPUID Fn0000_0007_EBX_x0[BMI] = 0.
                                             X          VEX.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BOUND Check Array Bound 

Checks whether an array index (first operand) is within the bounds of an array (second operand). The 
array index is a signed integer in the specified register. If the operand-size attribute is 16, the array 
operand is a memory location containing a pair of signed word-integers; if the operand-size attribute is 
32, the array operand is a pair of signed doubleword-integers. The first word or doubleword specifies 
the lower bound of the array and the second word or doubleword specifies the upper bound. 

The array index must be greater than or equal to the lower bound and less than or equal to the upper 
bound. If the index is not within the specified bounds, the processor generates a BOUND range- 
exceeded exception (#BR). 
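
The comparison that BOUND performs for the 32-bit operand size can be sketched in C as follows. The function name bound_in_range32 is hypothetical; a return value of 0 marks the case in which the processor would raise #BR.

```c
#include <stdint.h>

/* bounds[0] is the lower bound, bounds[1] the upper bound, as laid out in memory.
   Returns 1 when lower <= index <= upper (signed compare), 0 otherwise. */
int bound_in_range32(int32_t index, const int32_t bounds[2])
{
    int32_t lower = bounds[0];   /* first doubleword */
    int32_t upper = bounds[1];   /* second doubleword */
    return index >= lower && index <= upper;
}
```

Both ends of the range are inclusive, and the comparison is signed, so a negative lower bound is valid.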

The bounds of an array, consisting of two words or doublewords containing the lower and upper limits 
of the array, usually reside in a data structure just before the array itself, making the limits addressable 
through a constant offset from the beginning of the array. With the address of the array in a register, 
this practice reduces the number of bus cycles required to determine the effective address of the array 
bounds. 

Using this instruction in 64-bit mode generates an invalid-opcode exception. 


Mnemonic                    Opcode   Description
BOUND reg16, mem16&mem16    62 /r    Test whether a 16-bit array index is within the bounds
                                     specified by the two 16-bit values in mem16&mem16.
                                     (Invalid in 64-bit mode.)
BOUND reg32, mem32&mem32    62 /r    Test whether a 32-bit array index is within the bounds
                                     specified by the two 32-bit values in mem32&mem32.
                                     (Invalid in 64-bit mode.)


Related Instructions 

INT, INT3, INTO 

rFLAGS Affected 

None 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Bound range, #BR         X     X             X          The bound range was exceeded.
Invalid opcode, #UD      X     X             X          The source operand was a register.
                                             X          Instruction was executed in 64-bit mode.
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BSF Bit Scan Forward 

Searches the value in a register or a memory location (second operand) for the least-significant set bit. 
If a set bit is found, the instruction clears the zero flag (ZF) and stores the index of the least-significant 
set bit in a destination register (first operand). If the second operand contains 0, the instruction sets ZF 
to 1 and does not change the contents of the destination register. The bit index is an unsigned offset 
from bit 0 of the searched value. 
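
The behavior described above can be sketched in C for the 32-bit operand size. The names bsf_result and bsf32 are hypothetical; the old_dest parameter stands in for the prior contents of the destination register, which this sketch returns unchanged when the source is zero, as the text states.

```c
#include <stdint.h>

/* Hypothetical container: modeled zero flag plus destination register value. */
typedef struct {
    int zf;
    uint32_t dest;
} bsf_result;

bsf_result bsf32(uint32_t src, uint32_t old_dest)
{
    bsf_result r = { 1, old_dest };      /* zero source: ZF = 1, dest unchanged */
    if (src != 0u) {
        uint32_t index = 0u;
        while ((src & 1u) == 0u) {       /* scan upward from bit 0 */
            src >>= 1;
            index++;
        }
        r.zf = 0;
        r.dest = index;                  /* index of the least-significant set bit */
    }
    return r;
}
```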


Mnemonic                  Opcode      Description
BSF reg16, reg/mem16      0F BC /r    Bit scan forward on the contents of reg/mem16.
BSF reg32, reg/mem32      0F BC /r    Bit scan forward on the contents of reg/mem32.
BSF reg64, reg/mem64      0F BC /r    Bit scan forward on the contents of reg/mem64.

Related Instructions

BSR

rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       U               U   M   U   U   U
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.


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BSR Bit Scan Reverse 

Searches the value in a register or a memory location (second operand) for the most-significant set bit. 
If a set bit is found, the instruction clears the zero flag (ZF) and stores the index of the most-significant 
set bit in a destination register (first operand). If the second operand contains 0, the instruction sets ZF 
to 1 and does not change the contents of the destination register. The bit index is an unsigned offset 
from bit 0 of the searched value. 
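
A C sketch of the reverse scan for the 32-bit operand size is given below for illustration. The names bsr_result and bsr32 are hypothetical; old_dest models the prior destination register contents, returned unchanged for a zero source as described above.

```c
#include <stdint.h>

/* Hypothetical container: modeled zero flag plus destination register value. */
typedef struct {
    int zf;
    uint32_t dest;
} bsr_result;

bsr_result bsr32(uint32_t src, uint32_t old_dest)
{
    bsr_result r = { 1, old_dest };      /* zero source: ZF = 1, dest unchanged */
    if (src != 0u) {
        uint32_t index = 31u;
        while ((src & (UINT32_C(1) << index)) == 0u) {
            index--;                     /* scan downward from bit 31 */
        }
        r.zf = 0;
        r.dest = index;                  /* index of the most-significant set bit */
    }
    return r;
}
```

The returned index is still counted from bit 0, so for a nonzero source it equals the floor of the base-2 logarithm.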


Mnemonic                  Opcode      Description
BSR reg16, reg/mem16      0F BD /r    Bit scan reverse on the contents of reg/mem16.
BSR reg32, reg/mem32      0F BD /r    Bit scan reverse on the contents of reg/mem32.
BSR reg64, reg/mem64      0F BD /r    Bit scan reverse on the contents of reg/mem64.

Related Instructions

BSF

rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       U               U   M   U   U   U
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BSWAP Byte Swap 

Reverses the byte order of the specified register. This action converts the contents of the register from 
little endian to big endian or vice versa. In a doubleword, bits 7:0 are exchanged with bits 31:24, and 
bits 15:8 are exchanged with bits 23:16. In a quadword, bits 7:0 are exchanged with bits 63:56, bits 
15:8 with bits 55:48, bits 23:16 with bits 47:40, and bits 31:24 with bits 39:32. A subsequent use of the 
BSWAP instruction with the same operand restores the original value of the operand. 

The result of applying the BSWAP instruction to a 16-bit register is undefined. To swap the bytes of a 
16-bit register, use the XCHG instruction and specify the respective byte halves of the 16-bit register 
as the two operands. For example, to swap the bytes of AX, use XCHG AL, AH. 
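
The byte exchanges described above can be written out in C as a sketch. The function names byte_swap32 and swap16 are hypothetical; swap16 models the XCHG AL, AH workaround for 16-bit values.

```c
#include <stdint.h>

/* Reverses the four bytes of a doubleword, as described for BSWAP reg32. */
uint32_t byte_swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |   /* bits 7:0   -> bits 31:24 */
           ((v & 0x0000FF00u) << 8)  |   /* bits 15:8  -> bits 23:16 */
           ((v & 0x00FF0000u) >> 8)  |   /* bits 23:16 -> bits 15:8  */
           ((v & 0xFF000000u) >> 24);    /* bits 31:24 -> bits 7:0   */
}

/* 16-bit case: exchanges the two byte halves, like XCHG AL, AH on AX. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

Applying byte_swap32 twice returns the original value, which mirrors the statement that a second BSWAP restores the operand.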


Mnemonic       Opcode       Description
BSWAP reg32    0F C8 +rd    Reverse the byte order of reg32.
BSWAP reg64    0F C8 +rq    Reverse the byte order of reg64.


Related Instructions 

XCHG 

rFLAGS Affected 

None 

Exceptions 

None 


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BT Bit Test 

Copies a bit, specified by a bit index in a register or 8-bit immediate value (second operand), from a bit 
string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register. 

If the bit base operand is a register, the instruction uses the modulo 16, 32, or 64 (depending on the 
operand size) of the bit index to select a bit in the register. 

If the bit base operand is a memory location, bit 0 of the byte at the specified address is the bit base of 
the bit string. If the bit index is in a register, the instruction selects a bit position relative to the bit base 
in the range -2^63 to +2^63 - 1 if the operand size is 64, -2^31 to +2^31 - 1 if the operand size is 32, and 
-2^15 to +2^15 - 1 if the operand size is 16. If the bit index is in an immediate value, the bit selected is that 
value modulo 16, 32, or 64, depending on the operand size. 

When the instruction attempts to copy a bit from memory, it accesses 2, 4, or 8 bytes starting from the 
specified memory address for 16-bit, 32-bit, or 64-bit operand sizes, respectively, using the following 
formula: 

Effective Address + (NumBytes_i * (BitOffset DIV NumBits_(i*8))) 

When using this bit addressing mechanism, avoid referencing areas of memory close to address space 
holes, such as references to memory-mapped I/O registers. Instead, use a MOV instruction to load a 
register from such an address and use a register form of the BT instruction to manipulate the data. 
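
To make the memory bit addressing concrete, the following C sketch computes which operand-sized unit is accessed, as a byte offset from the effective address, and which bit within it is selected. It assumes that DIV rounds toward negative infinity, so negative bit offsets land below the bit base; the names bt_location and bt_locate are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result: byte offset from the effective address, and bit within the unit. */
typedef struct {
    int64_t byte_offset;
    unsigned bit;
} bt_location;

/* num_bytes is 2, 4, or 8 for 16-bit, 32-bit, or 64-bit operand sizes. */
bt_location bt_locate(int64_t bit_offset, unsigned num_bytes)
{
    int64_t num_bits = (int64_t)num_bytes * 8;
    int64_t q = bit_offset / num_bits;   /* C division truncates toward zero */
    int64_t r = bit_offset % num_bits;
    if (r < 0) {                         /* adjust to floor division (assumed DIV semantics) */
        q -= 1;
        r += num_bits;
    }
    bt_location loc = { q * (int64_t)num_bytes, (unsigned)r };
    return loc;
}
```

With a 32-bit operand size, a bit offset of 35 selects bit 3 of the doubleword 4 bytes above the effective address, and a bit offset of -1 selects bit 31 of the doubleword 4 bytes below it. This is why a large register index can touch memory far from the base address.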


Mnemonic               Opcode         Description
BT reg/mem16, reg16    0F A3 /r       Copy the value of the selected bit to the carry flag.
BT reg/mem32, reg32    0F A3 /r       Copy the value of the selected bit to the carry flag.
BT reg/mem64, reg64    0F A3 /r       Copy the value of the selected bit to the carry flag.
BT reg/mem16, imm8     0F BA /4 ib    Copy the value of the selected bit to the carry flag.
BT reg/mem32, imm8     0F BA /4 ib    Copy the value of the selected bit to the carry flag.
BT reg/mem64, imm8     0F BA /4 ib    Copy the value of the selected bit to the carry flag.

Related Instructions

BTC, BTR, BTS
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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       U               U   U   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BTC Bit Test and Complement 

Copies a bit, specified by a bit index in a register or 8-bit immediate value (second operand), from a bit 
string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then 
complements (toggles) the bit in the bit string. 

If the bit base operand is a register, the instruction uses the modulo 16, 32, or 64 (depending on the 
operand size) of the bit index to select a bit in the register. 

If the bit base operand is a memory location, bit 0 of the byte at the specified address is the bit base of 
the bit string. If the bit index is in a register, the instruction selects a bit position relative to the bit base 
in the range -2^63 to +2^63 - 1 if the operand size is 64, -2^31 to +2^31 - 1 if the operand size is 32, and 
-2^15 to +2^15 - 1 if the operand size is 16. If the bit index is in an immediate value, the bit selected is that 
value modulo 16, 32, or 64, depending on the operand size. 

This instruction is useful for implementing semaphores in concurrent operating systems. Such an 
application should precede this instruction with the LOCK prefix. For details about the LOCK prefix, 
see “Lock Prefix” on page 11. 
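
In portable C, the effect of a locked BTC on a 32-bit word can be approximated with the C11 atomic_fetch_xor operation, as sketched below. The function name is hypothetical, and whether a given compiler emits LOCK BTC for this code is not guaranteed. The index is reduced modulo 32, mirroring the register and immediate forms described above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically toggles bit `index` of *word and returns its previous value
   (the value BTC would leave in the carry flag). */
int atomic_bit_test_and_complement(_Atomic uint32_t *word, unsigned index)
{
    uint32_t mask = UINT32_C(1) << (index % 32u);
    uint32_t old = atomic_fetch_xor(word, mask);   /* read-modify-write as one atomic step */
    return (old & mask) != 0u;
}
```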


Mnemonic                Opcode         Description
BTC reg/mem16, reg16    0F BB /r       Copy the value of the selected bit to the carry flag, then
                                       complement the selected bit.
BTC reg/mem32, reg32    0F BB /r       Copy the value of the selected bit to the carry flag, then
                                       complement the selected bit.
BTC reg/mem64, reg64    0F BB /r       Copy the value of the selected bit to the carry flag, then
                                       complement the selected bit.
BTC reg/mem16, imm8     0F BA /7 ib    Copy the value of the selected bit to the carry flag, then
                                       complement the selected bit.
BTC reg/mem32, imm8     0F BA /7 ib    Copy the value of the selected bit to the carry flag, then
                                       complement the selected bit.
BTC reg/mem64, imm8     0F BA /7 ib    Copy the value of the selected bit to the carry flag, then
                                       complement the selected bit.


Related Instructions 

BT, BTR, BTS 
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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       U               U   U   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BTR Bit Test and Reset 

Copies a bit, specified by a bit index in a register or 8-bit immediate value (second operand), from a bit 
string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then 
clears the bit in the bit string to 0. 

If the bit base operand is a register, the instruction uses the modulo 16, 32, or 64 (depending on the 
operand size) of the bit index to select a bit in the register. 

If the bit base operand is a memory location, bit 0 of the byte at the specified address is the bit base of 
the bit string. If the bit index is in a register, the instruction selects a bit position relative to the bit base 
in the range -2^63 to +2^63 - 1 if the operand size is 64, -2^31 to +2^31 - 1 if the operand size is 32, and 
-2^15 to +2^15 - 1 if the operand size is 16. If the bit index is in an immediate value, the bit selected is that 
value modulo 16, 32, or 64, depending on the operand size. 

This instruction is useful for implementing semaphores in concurrent operating systems. Such 
applications should precede this instruction with the LOCK prefix. For details about the LOCK prefix, 
see “Lock Prefix” on page 11. 
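
As a hedged illustration of the locked usage mentioned above, the C11 sketch below clears one bit atomically and reports its previous value, which is the role the carry flag plays for BTR. The function name is hypothetical, and the mapping from this code to a LOCK BTR instruction is up to the compiler.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically clears bit `index` of *word and returns its previous value. */
int atomic_bit_test_and_reset(_Atomic uint32_t *word, unsigned index)
{
    uint32_t mask = UINT32_C(1) << (index % 32u);
    uint32_t old = atomic_fetch_and(word, ~mask);  /* clear only the selected bit */
    return (old & mask) != 0u;
}
```

A return value of 1 tells the caller that it was the one that cleared the bit, which is useful when several threads race to release or consume the same flag.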

Mnemonic                Opcode         Description
BTR reg/mem16, reg16    0F B3 /r       Copy the value of the selected bit to the carry flag, then
                                       clear the selected bit.
BTR reg/mem32, reg32    0F B3 /r       Copy the value of the selected bit to the carry flag, then
                                       clear the selected bit.
BTR reg/mem64, reg64    0F B3 /r       Copy the value of the selected bit to the carry flag, then
                                       clear the selected bit.
BTR reg/mem16, imm8     0F BA /6 ib    Copy the value of the selected bit to the carry flag, then
                                       clear the selected bit.
BTR reg/mem32, imm8     0F BA /6 ib    Copy the value of the selected bit to the carry flag, then
                                       clear the selected bit.
BTR reg/mem64, imm8     0F BA /6 ib    Copy the value of the selected bit to the carry flag, then
                                       clear the selected bit.

Related Instructions

BT, BTC, BTS
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rFLAGS Affected 


Flag   ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
Value                                       U               U   U   U   U   M
Bit    21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or
                                                        was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.
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BTS Bit Test and Set 

Copies a bit, specified by a bit index in a register or 8-bit immediate value (second operand), from a bit
string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then
sets the bit in the bit string to 1.

If the bit base operand is a register, the instruction uses the bit index modulo 16, 32, or 64 (depending
on the operand size) to select a bit in the register.

If the bit base operand is a memory location, bit 0 of the byte at the specified address is the bit base of
the bit string. If the bit index is in a register, the instruction selects a bit position relative to the bit base
in the range -2^63 to +2^63 - 1 if the operand size is 64, -2^31 to +2^31 - 1 if the operand size is 32, and
-2^15 to +2^15 - 1 if the operand size is 16. If the bit index is in an immediate value, the bit selected is that
value modulo 16, 32, or 64, depending on the operand size.

This instruction is useful for implementing semaphores in concurrent operating systems. Such 
applications should precede this instruction with the LOCK prefix. For details about the LOCK prefix, 
see “Lock Prefix” on page 11. 
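The bit-selection rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not the manual's own pseudocode: the function names `bts_reg` and `bts_mem` are hypothetical, a `bytearray` stands in for memory, and the LOCK prefix (atomicity) is not modelled.

```python
def bts_reg(value, index, op_size):
    """Model BTS with a register bit base: returns (new_value, CF)."""
    bit = index % op_size              # index is taken modulo 16, 32, or 64
    cf = (value >> bit) & 1            # the old bit value goes to the carry flag
    return value | (1 << bit), cf      # the selected bit is then set to 1


def bts_mem(memory, base, index):
    """Model BTS with a memory bit base and a signed register bit index.

    `memory` is a bytearray standing in for memory (an assumption of this
    sketch); `base` is the byte offset whose bit 0 is the bit base.
    """
    byte = base + (index >> 3)         # arithmetic shift: negative indexes reach lower addresses
    bit = index & 7                    # bit position within the selected byte
    cf = (memory[byte] >> bit) & 1
    memory[byte] |= 1 << bit
    return cf
```

For example, `bts_mem(mem, 2, -1)` addresses bit 7 of the byte at offset 1, which is the bit immediately below the bit base at offset 2.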

Mnemonic                Opcode        Description
BTS reg/mem16, reg16    0F AB /r      Copy the value of the selected bit to the carry flag, then set the selected bit.
BTS reg/mem32, reg32    0F AB /r      Copy the value of the selected bit to the carry flag, then set the selected bit.
BTS reg/mem64, reg64    0F AB /r      Copy the value of the selected bit to the carry flag, then set the selected bit.
BTS reg/mem16, imm8     0F BA /5 ib   Copy the value of the selected bit to the carry flag, then set the selected bit.
BTS reg/mem32, imm8     0F BA /5 ib   Copy the value of the selected bit to the carry flag, then set the selected bit.
BTS reg/mem64, imm8     0F BA /5 ib   Copy the value of the selected bit to the carry flag, then set the selected bit.

Related Instructions

BT, BTC, BTR

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rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL   OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                 U                       U     U     U     U     M
21    20    19    18    17    16    14    13:12  11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
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BZHI Zero High Bits 

Copies bits, left to right, from the first source operand starting with the bit position specified by the
second source operand (index), writes these bits to the destination, and clears all the bits in positions
greater than or equal to index.

This instruction has three operands:

BZHI dest, src, index

In 64-bit mode, the operand size (op_size) is determined by the value of VEX.W. If VEX.W is 1, the
operand size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored.
16-bit operands are not supported.

The destination (dest) is a general purpose register. The first source operand (src) is either a general
purpose register or a memory operand. The second source operand is a general purpose register. Bits
[7:0] of this register, treated as an unsigned 8-bit integer, specify the index of the most-significant bit
of the first source operand to be copied to the corresponding bit of the destination. Bits
[op_size-1:index] of the destination are cleared.

If the value of index is greater than or equal to the operand size, index is set to (op_size-1). In this case,
the CF flag is set.
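As a rough illustration of the rule that bits [op_size-1:index] are cleared, the following Python sketch models the result and the carry flag. The function name `bzhi` is hypothetical. One point is an assumption rather than something the text above states outright: for an out-of-range index the sketch keeps every source bit and sets CF, which is how the saturation case is commonly documented. Other flags are not modelled.

```python
def bzhi(src, index_reg, op_size=32):
    """Model BZHI: returns (dest, CF) for a 32- or 64-bit operand size."""
    mask = (1 << op_size) - 1
    index = index_reg & 0xFF               # only bits [7:0] of the index register are used
    if index >= op_size:
        # Assumption of this sketch: an out-of-range index leaves all
        # source bits intact and sets CF.
        return src & mask, 1
    # Keep bits [index-1:0]; bits [op_size-1:index] are cleared.
    return src & ((1 << index) - 1), 0
```

With an index of 8, for instance, only the low byte of the source survives.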

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


                                            Encoding
Mnemonic                        VEX   RXB.map_select   W.vvvv.L.pp    Opcode
BZHI reg32, reg/mem32, reg32    C4    RXB.02           0.index.0.00   F5 /r
BZHI reg64, reg/mem64, reg64    C4    RXB.02           1.index.0.00   F5 /r

Related Instructions 
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rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL   OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                 0                       M     M     U     U     M
21    20    19    18    17    16    14    13:12  11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


                                       Mode
Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        BMI2 instructions are only recognized in protected mode.
                                              X          BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
                                              X          VEX.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.
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CALL (Near) Near Procedure Call 

Pushes the offset of the next instruction onto the stack and branches to the target address, which 
contains the first instruction of the called procedure. The target operand can specify a register, a 
memory location, or a label. A procedure accessed by a near CALL is located in the same code 
segment as the CALL instruction. 

If the CALL target is specified by a register or memory location, then a 16-, 32-, or 64-bit rIP is read 
from the operand, depending on the operand size. A 16- or 32-bit rIP is zero-extended to 64 bits. 

If the CALL target is specified by a displacement, the signed displacement is added to the rIP (of the 
following instruction), and the result is truncated to 16, 32, or 64 bits, depending on the operand size. 
The signed displacement is 16 or 32 bits, depending on the operand size. 

In all cases, the rIP of the instruction after the CALL is pushed on the stack, and the size of the stack 
push (16, 32, or 64 bits) depends on the operand size of the CALL instruction. 
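The displacement form described above can be sketched as follows. This is an illustrative model only; the names `sign_extend` and `near_call_rel` are hypothetical, and the stack push itself is reduced to returning the value that would be pushed.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def near_call_rel(next_rip, disp, disp_bits, op_size):
    """Return (target_rip, pushed_return_address) for a relative near CALL.

    next_rip is the rIP of the instruction following the CALL; disp is the
    raw displacement field (16 or 32 bits wide, given by disp_bits).
    """
    mask = (1 << op_size) - 1
    # Signed displacement is added to the next rIP, then truncated to the operand size.
    target = (next_rip + sign_extend(disp, disp_bits)) & mask
    return target, next_rip & mask
```

In 64-bit mode the call would be modelled with `disp_bits=32` and `op_size=64`, matching the E8 opcode behaviour of adding a 32-bit signed displacement to RIP.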

For near calls in 64-bit mode, the operand size defaults to 64 bits. The E8 opcode results in 
RIP = RIP + 32-bit signed displacement and the FF /2 opcode results in RIP = 64-bit offset from 
register or memory. No prefix is available to encode a 32-bit operand size in 64-bit mode. 

At the end of the called procedure, RET is used to return control to the instruction following the 
original CALL. When RET is executed, the rIP is popped off the stack, which returns control to the 
instruction after the CALL. 

See CALL (Far) for information on far calls—calls to procedures located outside of the current code 
segment. For details about control-flow instructions, see “Control Transfers” in Volume 1, and 
“Control-Transfer Privilege Checks” in Volume 2. 


Mnemonic          Opcode   Description
CALL rel16off     E8 iw    Near call with the target specified by a 16-bit relative displacement.
CALL rel32off     E8 id    Near call with the target specified by a 32-bit relative displacement.
CALL reg/mem16    FF /2    Near call with the target specified by reg/mem16.
CALL reg/mem32    FF /2    Near call with the target specified by reg/mem32. (There is no prefix for encoding this in 64-bit mode.)
CALL reg/mem64    FF /2    Near call with the target specified by reg/mem64.



Related Instructions 

CALL (Far), RET (Near), RET (Far)
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rFLAGS Affected 

None. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                          X     X             X          The target offset exceeded the code segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
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CALL (Far) Far Procedure Call 

Pushes procedure linking information onto the stack and branches to the target address, which contains 
the first instruction of the called procedure. The operand specifies a target selector and offset. 

The instruction can specify the target directly, by including the far pointer in the immediate and 
displacement fields of the instruction, or indirectly, by referencing a far pointer in memory. In 64-bit 
mode, only indirect far calls are allowed; executing a direct far call (opcode 9A) generates an 
undefined opcode exception. For both direct and indirect far calls, if the CALL (Far) operand-size is 
16 bits, the instruction's operand is a 16-bit offset followed by a 16-bit selector. If the operand-size is 
32 or 64 bits, the operand is a 32-bit offset followed by a 16-bit selector. 
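The operand layout for an indirect far call (offset first, then a 16-bit selector) can be illustrated with a short decoder. This is a sketch under stated assumptions: `read_far_pointer` is a hypothetical helper, a `bytes` object stands in for memory, and the little-endian byte order is the usual x86 convention rather than something spelled out in the paragraph above.

```python
import struct


def read_far_pointer(memory, addr, op_size):
    """Decode a far pointer at `addr`: returns (selector, offset).

    op_size 16 -> 16-bit offset followed by a 16-bit selector;
    op_size 32 or 64 -> 32-bit offset followed by a 16-bit selector.
    """
    offset_bytes = 2 if op_size == 16 else 4
    fmt = "<H" if offset_bytes == 2 else "<I"      # little-endian offset field
    (offset,) = struct.unpack_from(fmt, memory, addr)
    (selector,) = struct.unpack_from("<H", memory, addr + offset_bytes)
    return selector, offset
```

Decoding the six bytes `78 56 34 12 08 00` with a 32-bit operand size, for example, gives selector 0x0008 and offset 0x12345678.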

The target selector used by the instruction can be a code selector in all modes. Additionally, the target 
selector can reference a call gate in protected mode, or a task gate or TSS selector in legacy protected 
mode. 

• Target is a code selector —The CS:rIP of the next instruction is pushed to the stack, using operand- 
size stack pushes. Then code is executed from the target CS:rIP. In this case, the target offset can 
only be a 16- or 32-bit value, depending on operand-size, and is zero-extended to 64 bits. No CPL 
change is allowed. 

• Target is a call gate —The call gate specifies the actual target code segment and offset. Call gates 
allow calls to the same or more privileged code. If the target segment is at the same CPL as the 
current code segment, the CS:rIP of the next instruction is pushed to the stack. 

If the CALL (Far) changes privilege level, then a stack-switch occurs, using an inner-level stack 
pointer from the TSS. The CS:rIP of the next instruction is pushed to the new stack. If the mode is 
legacy mode and the param-count field in the call gate is non-zero, then up to 31 operands are 
copied from the caller's stack to the new stack. Finally, the caller's SS:rSP is pushed to the new 
stack. 

When calling through a call gate, the stack pushes are 16, 32, or 64 bits, depending on the size of
the call gate. The size of the target rIP is also 16, 32, or 64 bits, depending on the size of the call
gate. If the target rIP is less than 64 bits, it is zero-extended to 64 bits. Long mode only allows
64-bit call gates that must point to 64-bit code segments.

• Target is a task gate or a TSS —If the mode is legacy protected mode, then a task switch occurs.
See “Hardware Task-Management in Legacy Mode” in Volume 2 for details about task switches.
Hardware task switches are not supported in long mode.

See CALL (Near) for information on near calls—calls to procedures located inside the current code 
segment. For details about control-flow instructions, see “Control Transfers” in Volume 1, and 
“Control-Transfer Privilege Checks” in Volume 2. 
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Mnemonic 

Opcode 

Description 

CALL FAR pntr16:16    9A cd    Far call direct, with the target specified by a far pointer contained in the instruction. (Invalid in 64-bit mode.)
CALL FAR pntr16:32    9A cp    Far call direct, with the target specified by a far pointer contained in the instruction. (Invalid in 64-bit mode.)
CALL FAR mem16:16     FF /3    Far call indirect, with the target specified by a far pointer in memory.
CALL FAR mem16:32     FF /3    Far call indirect, with the target specified by a far pointer in memory.

Action

// See "Pseudocode Definition" on page 57.

CALLF_START:

IF (REAL_MODE)
    CALLF_REAL_OR_VIRTUAL
ELSIF (PROTECTED_MODE)
    CALLF_PROTECTED
ELSE // (VIRTUAL_MODE)
    CALLF_REAL_OR_VIRTUAL

CALLF_REAL_OR_VIRTUAL:

    IF (OPCODE == callf [mem]) // CALLF Indirect
    {
        temp_RIP = READ_MEM.z [mem]
        temp_CS = READ_MEM.w [mem+Z]
    }
    ELSE // (OPCODE == callf direct)
    {
        temp_RIP = z-sized offset specified in the instruction
                   zero-extended to 64 bits
        temp_CS = selector specified in the instruction
    }

    PUSH.v old_CS
    PUSH.v next_RIP

    IF (temp_RIP > CS.limit)
        EXCEPTION [#GP(0)]

    CS.sel = temp_CS
    CS.base = temp_CS SHL 4

    RIP = temp_RIP
    EXIT

CALLF_PROTECTED:

    IF (OPCODE == callf [mem]) // CALLF Indirect
    {
        temp_offset = READ_MEM.z [mem]
        temp_sel = READ_MEM.w [mem+Z]
    }
    ELSE // (OPCODE == callf direct)
    {
        IF (64BIT_MODE)
            EXCEPTION [#UD] // 'CALLF direct' is illegal in 64-bit mode.

        temp_offset = z-sized offset specified in the instruction
                      zero-extended to 64 bits
        temp_sel = selector specified in the instruction
    }

    temp_desc = READ_DESCRIPTOR (temp_sel, cs_chk)

    IF (temp_desc.attr.type == 'available_tss')
        TASK_SWITCH // Using temp_sel as the target TSS selector.
    ELSIF (temp_desc.attr.type == 'taskgate')
        TASK_SWITCH // Using the TSS selector in the task gate
                    // as the target TSS.
    ELSIF (temp_desc.attr.type == 'code')
        // If the selector refers to a code descriptor, then
        // the offset we read is the target RIP.
    {
        temp_RIP = temp_offset
        CS = temp_desc
        PUSH.v old_CS
        PUSH.v next_RIP

        IF ((!64BIT_MODE) && (temp_RIP > CS.limit))
            EXCEPTION [#GP(0)] // temp_RIP can't be non-canonical because
                               // it's a 16- or 32-bit offset, zero-extended
                               // to 64 bits.
        RIP = temp_RIP
        EXIT
    }
    ELSE // (temp_desc.attr.type == 'callgate')
        // If the selector refers to a call gate, then
        // the target CS and RIP both come from the call gate.
    {
        // The size of the gate controls the size of the stack pushes.
        IF (LONG_MODE)
            V=8-byte // Long mode only uses 64-bit call gates, force 8-byte opsize.
        ELSIF (temp_desc.attr.type == 'callgate32')
            V=4-byte // Legacy mode, using a 32-bit call-gate, force 4-byte opsize.
        ELSE // (temp_desc.attr.type == 'callgate16')
            V=2-byte // Legacy mode, using a 16-bit call-gate, force 2-byte opsize.

        temp_RIP = temp_desc.offset

        IF (LONG_MODE) // In long mode, we need to read the 2nd half of a
                       // 16-byte call-gate from the GDT/LDT, to get the upper
                       // 32 bits of the target RIP.
        {
            temp_upper = READ_MEM.q [temp_sel+8]
            IF (temp_upper's extended attribute bits != 0)
                EXCEPTION [#GP(temp_sel)]
            temp_RIP = temp_RIP + (temp_upper SHL 32)
                       // Concatenate both halves of RIP
        }

        CS = READ_DESCRIPTOR (temp_desc.segment, clg_chk)

        IF (CS.attr.conforming==1)
            temp_CPL = CPL
        ELSE
            temp_CPL = CS.attr.dpl

        IF (CPL==temp_CPL)
        {
            PUSH.v old_CS
            PUSH.v next_RIP

            IF ((64BIT_MODE) && (temp_RIP is non-canonical)
                || (!64BIT_MODE) && (temp_RIP > CS.limit))
            {
                EXCEPTION [#GP(0)]
            }

            RIP = temp_RIP
            EXIT
        }
        ELSE // (CPL != temp_CPL), Changing privilege level.
        {
            CPL = temp_CPL
            temp_ist = 0 // Call-far doesn't use ist pointers.
            temp_SS_desc:temp_RSP = READ_INNER_LEVEL_STACK_POINTER (CPL, temp_ist)

            RSP.q = temp_RSP
            SS = temp_SS_desc

            PUSH.v old_SS // #SS on this and following pushes use
                          // SS.sel as error code.
            PUSH.v old_RSP

            IF (LEGACY_MODE) // Legacy-mode call gates have
            {                // a param_count field.
                temp_PARAM_COUNT = temp_desc.attr.param_count
                FOR (I=temp_PARAM_COUNT; I>0; I--)
                {
                    temp_DATA = READ_MEM.v [old_SS:(old_RSP+I*V)]
                    PUSH.v temp_DATA
                }
            }

            PUSH.v old_CS
            PUSH.v next_RIP

            IF ((64BIT_MODE) && (temp_RIP is non-canonical)
                || (!64BIT_MODE) && (temp_RIP > CS.limit))
            {
                EXCEPTION [#GP(0)]
            }

            RIP = temp_RIP
            EXIT
        }
    }



Related Instructions 

CALL (Near), RET (Near), RET (Far) 

rFLAGS Affected 

None, unless a task switch occurs, in which case all flags are modified. 
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Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The far CALL indirect opcode (FF /3) had a register operand.
                                                         X          The far CALL direct opcode (9A) was executed in 64-bit mode.
Invalid TSS, #TS (selector)                              X          As part of a stack switch, the target stack segment selector or rSP in the TSS was beyond the TSS limit.
                                                         X          As part of a stack switch, the target stack segment selector in the TSS was a null selector.
                                                         X          As part of a stack switch, the target stack selector’s TI bit was set, but LDT selector was a null selector.
                                                         X          As part of a stack switch, the target stack segment selector in the TSS was beyond the limit of the GDT or LDT descriptor table.
                                                         X          As part of a stack switch, the target stack segment selector in the TSS contained a RPL that was not equal to its DPL.
                                                         X          As part of a stack switch, the target stack segment selector in the TSS contained a DPL that was not equal to the CPL of the code segment selector.
                                                         X          As part of a stack switch, the target stack segment selector in the TSS was not a writable segment.
Segment not present, #NP (selector)                      X          The accessed code segment, call gate, task gate, or TSS was not present.
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical, and no stack switch occurred.
Stack, #SS (selector)                                    X          After a stack switch, a memory access exceeded the stack segment limit or was non-canonical.
                                                         X          As part of a stack switch, the SS register was loaded with a non-null segment selector and the segment was marked not present.
General protection, #GP              X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                     X     X             X          The target offset exceeded the code segment limit or was non-canonical.
                                                         X          A null data segment was used to reference memory.
General protection, #GP (selector)                       X          The target code segment selector was a null selector.
                                                         X          A code, call gate, task gate, or TSS descriptor exceeded the descriptor table limit.
                                                         X          A segment selector’s TI bit was set but the LDT selector was a null selector.
                                                         X          The segment descriptor specified by the instruction was not a code segment, task gate, call gate or available TSS in legacy mode, or not a 64-bit code segment or a 64-bit call gate in long mode.
                                                         X          The RPL of the non-conforming code segment selector specified by the instruction was greater than the CPL, or its DPL was not equal to the CPL.
                                                         X          The DPL of the conforming code segment descriptor specified by the instruction was greater than the CPL.
                                                         X          The DPL of the call gate, task gate, or TSS descriptor specified by the instruction was less than the CPL, or less than its own RPL.
                                                         X          The segment selector specified by the call gate or task gate was a null selector.
                                                         X          The segment descriptor specified by the call gate was not a code segment in legacy mode, or not a 64-bit code segment in long mode.
                                                         X          The DPL of the segment descriptor specified by the call gate was greater than the CPL.
                                                         X          The 64-bit call gate’s extended attribute bits were not zero.
                                                         X          The TSS descriptor was found in the LDT.
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                       X             X          An unaligned memory reference was performed while alignment checking was enabled.
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CBW Convert to Sign-Extended 

CWDE 

CDQE 

Copies the sign bit in the AL or eAX register to the upper bits of the rAX register. The effect of this 
instruction is to convert a signed byte, word, or doubleword in the AL or eAX register into a signed 
word, doubleword, or quadword in the rAX register. This action helps avoid overflow problems in 
signed number arithmetic. 

The CDQE mnemonic is meaningful only in 64-bit mode. 
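The widening performed by these three mnemonics can be sketched in Python. The helper name `sign_extend_acc` is hypothetical, and the sketch returns only the widened result; it does not model what happens to any rAX bits above the destination width.

```python
def sign_extend_acc(rax, src_bits):
    """Model CBW (src_bits=8), CWDE (16), or CDQE (32).

    Takes the low src_bits of rAX as a signed value and returns it
    sign-extended to twice that width.
    """
    dst_bits = 2 * src_bits
    low = rax & ((1 << src_bits) - 1)
    if low >> (src_bits - 1):                               # sign bit set?
        low |= ((1 << dst_bits) - 1) ^ ((1 << src_bits) - 1)  # fill the upper half with ones
    return low
```

For example, a byte value of 0x80 (-128) in AL becomes 0xFF80 in AX, which is still -128 as a signed word.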


Mnemonic   Opcode   Description
CBW        98       Sign-extend AL into AX.
CWDE       98       Sign-extend AX into EAX.
CDQE       98       Sign-extend EAX into RAX.


Related Instructions 

CWD, CDQ, CQO 

rFLAGS Affected 

None 

Exceptions 

None 



CWD Convert to Sign-Extended 

CDQ 

CQO 

Copies the sign bit in the rAX register to all bits of the rDX register. The effect of this instruction is to 
convert a signed word, doubleword, or quadword in the rAX register into a signed doubleword, 
quadword, or double-quadword in the rDX:rAX registers. This action helps avoid overflow problems 
in signed number arithmetic. 

The CQO mnemonic is meaningful only in 64-bit mode. 
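A minimal Python sketch of this register-pair widening follows. The helper name `sign_extend_pair` is hypothetical; it simply returns the values that rDX and rAX would hold at the given operand size.

```python
def sign_extend_pair(rax, bits):
    """Model CWD (bits=16), CDQ (32), or CQO (64): returns (rDX, rAX)."""
    mask = (1 << bits) - 1
    rax &= mask
    # Every bit of rDX becomes a copy of the sign bit of rAX.
    rdx = mask if rax >> (bits - 1) else 0
    return rdx, rax
```

The resulting rDX:rAX pair has the form a signed divide expects for its double-width dividend, which is one common reason to place one of these instructions ahead of a signed division.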


Mnemonic   Opcode   Description
CWD        99       Sign-extend AX into DX:AX.
CDQ        99       Sign-extend EAX into EDX:EAX.
CQO        99       Sign-extend RAX into RDX:RAX.


Related Instructions 

CBW, CWDE, CDQE 

rFLAGS Affected 

None 

Exceptions 

None 
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CLC Clear Carry Flag 

Clears the carry flag (CF) in the rFLAGS register to zero. 

Mnemonic Opcode Description 

CLC F8 Clear the carry flag (CF) to zero. 

Related Instructions

STC, CMC

rFLAGS Affected


ID    VIP   VIF   AC    VM    RF    NT    IOPL   OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                                                                 0
21    20    19    18    17    16    14    13:12  11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

None 
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CLD Clear Direction Flag 

Clears the direction flag (DF) in the rFLAGS register to zero. If the DF flag is 0, each iteration of a 
string instruction increments the data pointer (index registers rSI or rDI). If the DF flag is 1, the string 
instruction decrements the pointer. Use the CLD instruction before a string instruction to make the 
data pointer increment. 
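The effect of DF on a string operation can be illustrated with a small byte-copy model. This is only a sketch: `string_step` and `copy_bytes` are hypothetical names, a `bytearray` stands in for memory, and the loop imitates a repeated byte move rather than any specific instruction encoding.

```python
def string_step(pointer, df, element_size=1):
    """Advance a string pointer: increment when DF = 0, decrement when DF = 1."""
    return pointer - element_size if df else pointer + element_size


def copy_bytes(memory, src, dst, count, df=0):
    """Copy `count` bytes, stepping both pointers as DF dictates.

    Returns the final (src, dst) pointer values, standing in for rSI and rDI.
    """
    for _ in range(count):
        memory[dst] = memory[src]
        src = string_step(src, df)
        dst = string_step(dst, df)
    return src, dst
```

With DF cleared the copy walks upward from the first byte; with DF set the caller would instead pass the addresses of the last bytes and the copy walks downward.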


Mnemonic Opcode Description 

CLD FC Clear the direction flag (DF) to zero. 


Related Instructions 

CMPSx, INSx, LODSx, MOVSx, OUTSx, SCASx, STD, STOSx 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL   OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                       0
21    20    19    18    17    16    14    13:12  11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

None 
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CLFLUSH Cache Line Flush 

Flushes the cache line specified by the mem8 linear-address. The instruction checks all levels of the 
cache hierarchy—internal caches and external caches—and invalidates the cache line in every cache 
in which it is found. If a cache contains a dirty copy of the cache line (that is, the cache line is in the 
modified or owned MOESI state), the line is written back to memory before it is invalidated. The 
instruction sets the cache-line MOESI state to invalid. 

The instruction also checks the physical address corresponding to the linear-address operand against 
the processor’s write-combining buffers. If the write-combining buffer holds data intended for that 
physical address, the instruction writes the entire contents of the buffer to memory. This occurs even 
though the data is not cached in the cache hierarchy. In a multiprocessor system, the instruction checks 
the write-combining buffers only on the processor that executed the CLFLUSH instruction. 

On processors that do not support the CLFLUSHOPT instruction (CPUID
Fn0000_0007_EBX_x0[CLFLSHOPT] = 0), the CLFLUSH instruction is weakly ordered with respect to
other instructions that operate on memory. Speculative loads initiated by the processor, or specified 
explicitly using cache-prefetch instructions, can be reordered around a CLFLUSH instruction. Such 
reordering can invalidate a speculatively prefetched cache line, unintentionally defeating the prefetch 
operation. The only way to avoid this situation is to use the MFENCE instruction after the CLFLUSH 
instruction to force strong-ordering of the CLFLUSH instruction with respect to subsequent memory 
operations. The CLFLUSH instruction may also take effect on a cache line while stores from previous 
store instructions are still pending in the store buffer. To ensure that such stores are included in the 
cache line that is flushed, use an MFENCE instruction ahead of the CLFLUSH instruction. Such 
stores would otherwise cause the line to be re-cached and modified after the CLFLUSH completed. 
The LFENCE, SFENCE, and serializing instructions are not ordered with respect to CLFLUSH. 

On processors that support CLFLUSHOPT (CPUID Fn0000_0007_EBX_x0[CLFLSHOPT] = 1),
CLFLUSH is ordered with respect to locked operations, fence instructions, and CLFLUSHOPT,
CLFLUSH, and write instructions that touch the same cache line. CLFLUSH is not ordered with
CLFLUSHOPT, CLFLUSH, and write instructions to other cache lines.

The CLFLUSH instruction behaves like a load instruction with respect to setting the page-table 
accessed and dirty bits. That is, it sets the page-table accessed bit to 1, but does not set the page-table 
dirty bit. 

The CLFLUSH instruction executes at any privilege level. CLFLUSH performs all the segmentation
and paging checks that a 1-byte read would perform, except that it also allows references to
execute-only segments.

The CLFLUSH instruction is supported if the feature flag CPUID Fn0000_0001_EDX[CLFSH] is set.
The 8-bit field CPUID Fn0000_0001_EBX[CLFlush] returns the size of the cache line in quadwords.
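Because the CLFlush field is reported in quadwords, software has to multiply it by 8 to get a size in bytes. The sketch below shows that conversion on a raw EBX value. The helper name `clflush_line_bytes` is hypothetical, and the field position (bits 15:8 of EBX) is an assumption of this sketch, since the text above gives only the field width.

```python
def clflush_line_bytes(ebx):
    """Decode the CLFlush field of CPUID Fn0000_0001_EBX into a size in bytes."""
    quadwords = (ebx >> 8) & 0xFF   # assumed field position: EBX bits 15:8
    return quadwords * 8            # one quadword is 8 bytes
```

A field value of 8 therefore corresponds to a 64-byte cache line.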

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 
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Mnemonic Opcode Description 

CLFLUSH mem8 0F AE /7 Flush cache line containing mem8.

Related Instructions 

INVD, WBINVD, CLFLUSHOPT, CLZERO 

rFLAGS Affected 

None 


Exceptions 


Exception (vector)        Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          CLFLUSH instruction is not supported, as indicated by CPUID Fn0000_0001_EDX[CLFSH] = 0.
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
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CLFLUSHOPT Optimized Cache Line Flush 

Flushes the cache line specified by the mem8 linear-address. The instruction checks all levels of the
cache hierarchy (internal caches and external caches) and invalidates the cache line in every cache in
which it is found. If a cache contains a dirty copy of the cache line (that is, the cache line is in the 
modified or owned MOESI state), the line is written back to memory before it is invalidated. The 
instruction sets the cache-line MOESI state to invalid. 

The instruction also checks the physical address corresponding to the linear-address operand against 
the processor’s write-combining buffers. If the write-combining buffer holds data intended for that 
physical address, the instruction writes the entire contents of the buffer to memory. This occurs even 
though the data is not cached in the cache hierarchy. In a multiprocessor system, the instruction checks 
the write-combining buffers only on the processor that executed the CLFLUSHOPT instruction. 

The CLFLUSHOPT instruction is ordered with respect to fence instructions and locked operations. 
CLFLUSHOPT is also ordered with writes, CLFLUSH, and CLFLUSHOPT instructions that 
reference the same cache line as the CLFLUSHOPT. CLFLUSHOPT is not ordered with writes, 
CLFLUSH, and CLFLUSHOPT to other cache lines. To enforce ordering in that situation, an SFENCE
instruction or stronger should be used.

Speculative loads initiated by the processor, or specified explicitly using cache-prefetch instructions, 
can be reordered around a CLFLUSHOPT instruction. Such reordering can invalidate a speculatively 
prefetched cache line, unintentionally defeating the prefetch operation. 

The only way to avoid this situation is to use the MFENCE instruction after the CLFLUSHOPT 
instruction to force strong ordering of the CLFLUSHOPT instruction with respect to subsequent 
memory operations. 

The CLFLUSHOPT instruction behaves like a load instruction with respect to setting the page-table 
accessed and dirty bits. That is, it sets the page-table accessed bit to 1, but does not set the page-table 
dirty bit. 

The CLFLUSHOPT instruction executes at any privilege level. CLFLUSHOPT performs all the 
segmentation and paging checks that a 1-byte read would perform, except that it also allows references 
to execute-only segments. 

The CLFLUSHOPT instruction is supported if the feature flag CPUID
Fn0000_0007_EBX_x0[CLFSHOPT] is set. The 8-bit field CPUID Fn0000_0001_EBX[CLFlush]
returns the size of the cache line in quadwords.
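Since each flush acts on one cache line, flushing a buffer means issuing one flush per line that the buffer touches. The sketch below computes those line addresses. It is illustrative only: `lines_to_flush` is a hypothetical helper, the 64-byte default is an assumption, and the line size is assumed to be a power of two.

```python
def lines_to_flush(addr, length, line_size=64):
    """Return the base address of every cache line overlapping [addr, addr+length).

    line_size is assumed to be a power of two; in practice it would be
    derived from the CLFlush field reported by CPUID.
    """
    if length <= 0:
        return []
    first = addr & ~(line_size - 1)                  # round down to a line boundary
    last = (addr + length - 1) & ~(line_size - 1)    # line holding the final byte
    return list(range(first, last + 1, line_size))
```

A 2-byte buffer that straddles a line boundary yields two addresses, which is why the count of flushes depends on alignment as well as length.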

Mnemonic Opcode Description 

CLFLUSHOPT mem8 66 0F AE /7 Flush cache line containing mem8.
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Related Instructions 

CLFLUSH 

rFLAGS Affected 

None 

Exceptions 


Exception (vector)        Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          CLFLUSH instruction is not supported, as indicated by CPUID Fn0000_0001_EDX[CLFSH] = 0.
                          X     X             X          Instruction not supported by CPUID Fn0000_0007_EBX_x0[CLFLUSHOPT] = 0.
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
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CLWB Cache Line Write Back and Retain 

Flushes the cache line specified by the mem8 linear address. The instruction checks all levels of the
cache hierarchy—internal caches and external caches—and causes the cache line, if dirty, to be written 
to memory. The cache line may be retained in the cache where found in a non-dirty state. 

The CLWB instruction is weakly ordered with respect to other instructions that operate on memory. 
Speculative loads initiated by the processor, or specified explicitly using cache prefetch instructions, 
can be reordered around a CLWB instruction. CLWB is ordered naturally with older stores to the same 
address on the same logical processor. To create strict ordering of CLWB, use a store-ordering
instruction such as SFENCE.

The CLWB instruction behaves like a load instruction with respect to setting the page-table accessed
and dirty bits. That is, it sets the page-table accessed bit to 1, but does not set the page-table dirty bit.

The CLWB instruction executes at any privilege level. CLWB performs all the segmentation and
paging checks that a 1-byte read would perform, except that it also allows references to execute-only
segments.

The CLWB instruction is supported if the feature flag CPUID Fn0000_0007_EBX[24] = 1.

The 8-bit field CPUID Fn0000_0001_EBX[CLFlush] returns the size of the cache line in quadwords.


Mnemonic             Opcode         Description

CLWB                 66 0F AE /6    Cache line write-back.

Related Instructions 

CLFLUSH, CLFLUSHOPT, WBINVD, WBNOINVD 

rFLAGS Affected 


None


Exceptions 


Exception (vector)      Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD      X         X            X      Instruction not supported by CPUID
                                                       Fn0000_0007_EBX[24] = 0.
Stack, #SS               X         X            X      A memory address exceeded the stack segment limit
                                                       or was non-canonical.
General protection, #GP  X         X            X      A memory address exceeded a data segment limit or
                                                       was non-canonical.
                                                X      A null data segment was used to reference memory.
Page fault, #PF                    X            X      A page fault resulted from the execution of the
                                                       instruction.
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CLZERO Zero Cache Line 

Clears the cache line specified by the logical address in rAX by writing a zero to every byte in the line.
The instruction uses an implied non-temporal memory type, similar to a streaming store, and uses the
write-combining protocol to minimize cache pollution.

CLZERO is weakly ordered with respect to other instructions that operate on memory. Software
should use SFENCE or a stronger fence to enforce memory ordering of CLZERO with respect to other
store instructions.

The CLZERO instruction executes at any privilege level. CLZERO performs all the segmentation and 
paging checks that a store of the specified cache line would perform. 

The CLZERO instruction is supported if the feature flag CPUID Fn8000_0008_EBX[CLZERO] is
set. The 8-bit field CPUID Fn0000_0001_EBX[CLFlush] returns the size of the cache line in
quadwords.
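
The visible memory effect of CLZERO can be sketched in Python as follows. This is a minimal model only: it ignores memory type, ordering, segmentation, and paging, and it assumes a power-of-two line size (64 bytes by default, which is an assumption rather than an architectural guarantee). The name clzero and the flat bytearray standing in for memory are hypothetical.

```python
def clzero(memory: bytearray, rax: int, line_size: int = 64) -> None:
    """Zero the whole cache line that contains address rax.

    Assumptions: memory is a flat byte array and line_size is a power of two.
    """
    base = rax & ~(line_size - 1)  # round the address down to the line start
    memory[base:base + line_size] = bytes(line_size)
```

Any address inside a line selects the same line, so clzero(mem, 70) and clzero(mem, 64) clear the same 64 bytes in this model.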

Mnemonic             Opcode         Description

CLZERO rAX           0F 01 FC       Clears cache line containing rAX.


Related Instructions 

CLFLUSH 

rFLAGS Affected 

None 


Exceptions 


Exception (vector)      Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD      X         X            X      Instruction not supported by CPUID
                                                       Fn8000_0008_EBX[CLZERO] = 0.
Stack, #SS               X         X            X      A memory address exceeded the stack segment limit
                                                       or was non-canonical.
General protection, #GP  X         X            X      A memory address exceeded a data segment limit or
                                                       was non-canonical.
                                                X      A null data segment was used to reference memory.
Page fault, #PF                    X            X      A page fault resulted from the execution of the
                                                       instruction.
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CMC Complement Carry Flag 

Complements (toggles) the carry flag (CF) bit of the rFLAGS register. 

Mnemonic Opcode Description 

CMC F5 Complement the carry flag (CF). 

Related Instructions

CLC, STC

rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                                                                M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 

None 
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CMOVcc Conditional Move 

Conditionally moves a 16-bit, 32-bit, or 64-bit value in memory or a general-purpose register (second 
operand) into a register (first operand), depending upon the settings of condition flags in the rFLAGS 
register. If the condition is not satisfied, the destination register is not modified. For the memory-based 
forms of CMOVcc, memory-related exceptions may be reported even if the condition is false. In 64-bit
mode, CMOVcc with a 32-bit operand size will clear the upper 32 bits of the destination register even 
if the condition is false. 
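
The register-update rule above can be sketched in Python as a model of the destination register in 64-bit mode. The function name cmov and the size parameter are hypothetical. The 16-bit case, which leaves the upper 48 bits unchanged, follows general 64-bit mode register-write behavior and is an assumption not stated on this page.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def cmov(dest: int, src: int, condition: bool, size: int) -> int:
    """Return the 64-bit destination register value after CMOVcc (64-bit mode)."""
    if size == 64:
        return (src & MASK64) if condition else dest
    if size == 32:
        # The upper 32 bits are cleared even when the condition is false.
        return (src if condition else dest) & MASK32
    if size == 16:
        if condition:
            # Assumption: a 16-bit write preserves bits 63:16 of the register.
            return (dest & ~0xFFFF & MASK64) | (src & 0xFFFF)
        return dest
    raise ValueError("operand size must be 16, 32, or 64")
```

The 32-bit case is the one most likely to surprise: a false condition still changes the destination register if its upper half was nonzero.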

The mnemonics of CMOVcc instructions denote the condition that must be satisfied. Most assemblers 
provide instruction mnemonics with A (above) and B (below) tags to supply the semantics for 
manipulating unsigned integers. Those with G (greater than) and L (less than) tags deal with signed 
integers. Many opcodes may be represented by synonymous mnemonics. For example, the CMOVL
and CMOVNGE mnemonics are synonyms; both denote the instruction with the opcode
0F 4C.

The feature flag CPUID Fn0000_0001_EDX[CMOV] or CPUID Fn8000_0001_EDX[CMOV] = 1
indicates support for CMOVcc instructions on a particular processor implementation.

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                      Opcode      Description

CMOVO reg16, reg/mem16        0F 40 /r    Move if overflow (OF = 1).
CMOVO reg32, reg/mem32
CMOVO reg64, reg/mem64

CMOVNO reg16, reg/mem16       0F 41 /r    Move if not overflow (OF = 0).
CMOVNO reg32, reg/mem32
CMOVNO reg64, reg/mem64

CMOVB reg16, reg/mem16        0F 42 /r    Move if below (CF = 1).
CMOVB reg32, reg/mem32
CMOVB reg64, reg/mem64

CMOVC reg16, reg/mem16        0F 42 /r    Move if carry (CF = 1).
CMOVC reg32, reg/mem32
CMOVC reg64, reg/mem64

CMOVNAE reg16, reg/mem16      0F 42 /r    Move if not above or equal (CF = 1).
CMOVNAE reg32, reg/mem32
CMOVNAE reg64, reg/mem64

CMOVNB reg16, reg/mem16       0F 43 /r    Move if not below (CF = 0).
CMOVNB reg32, reg/mem32
CMOVNB reg64, reg/mem64

CMOVNC reg16, reg/mem16       0F 43 /r    Move if not carry (CF = 0).
CMOVNC reg32, reg/mem32
CMOVNC reg64, reg/mem64

CMOVAE reg16, reg/mem16       0F 43 /r    Move if above or equal (CF = 0).
CMOVAE reg32, reg/mem32
CMOVAE reg64, reg/mem64

CMOVZ reg16, reg/mem16        0F 44 /r    Move if zero (ZF = 1).
CMOVZ reg32, reg/mem32
CMOVZ reg64, reg/mem64

CMOVE reg16, reg/mem16        0F 44 /r    Move if equal (ZF = 1).
CMOVE reg32, reg/mem32
CMOVE reg64, reg/mem64

CMOVNZ reg16, reg/mem16       0F 45 /r    Move if not zero (ZF = 0).
CMOVNZ reg32, reg/mem32
CMOVNZ reg64, reg/mem64

CMOVNE reg16, reg/mem16       0F 45 /r    Move if not equal (ZF = 0).
CMOVNE reg32, reg/mem32
CMOVNE reg64, reg/mem64

CMOVBE reg16, reg/mem16       0F 46 /r    Move if below or equal (CF = 1 or ZF = 1).
CMOVBE reg32, reg/mem32
CMOVBE reg64, reg/mem64

CMOVNA reg16, reg/mem16       0F 46 /r    Move if not above (CF = 1 or ZF = 1).
CMOVNA reg32, reg/mem32
CMOVNA reg64, reg/mem64

CMOVNBE reg16, reg/mem16      0F 47 /r    Move if not below or equal (CF = 0 and ZF = 0).
CMOVNBE reg32, reg/mem32
CMOVNBE reg64, reg/mem64

CMOVA reg16, reg/mem16        0F 47 /r    Move if above (CF = 0 and ZF = 0).
CMOVA reg32, reg/mem32
CMOVA reg64, reg/mem64

CMOVS reg16, reg/mem16        0F 48 /r    Move if sign (SF = 1).
CMOVS reg32, reg/mem32
CMOVS reg64, reg/mem64

CMOVNS reg16, reg/mem16       0F 49 /r    Move if not sign (SF = 0).
CMOVNS reg32, reg/mem32
CMOVNS reg64, reg/mem64

CMOVP reg16, reg/mem16        0F 4A /r    Move if parity (PF = 1).
CMOVP reg32, reg/mem32
CMOVP reg64, reg/mem64

CMOVPE reg16, reg/mem16       0F 4A /r    Move if parity even (PF = 1).
CMOVPE reg32, reg/mem32
CMOVPE reg64, reg/mem64

CMOVNP reg16, reg/mem16       0F 4B /r    Move if not parity (PF = 0).
CMOVNP reg32, reg/mem32
CMOVNP reg64, reg/mem64

CMOVPO reg16, reg/mem16       0F 4B /r    Move if parity odd (PF = 0).
CMOVPO reg32, reg/mem32
CMOVPO reg64, reg/mem64

CMOVL reg16, reg/mem16        0F 4C /r    Move if less (SF <> OF).
CMOVL reg32, reg/mem32
CMOVL reg64, reg/mem64

CMOVNGE reg16, reg/mem16      0F 4C /r    Move if not greater or equal (SF <> OF).
CMOVNGE reg32, reg/mem32
CMOVNGE reg64, reg/mem64

CMOVNL reg16, reg/mem16       0F 4D /r    Move if not less (SF = OF).
CMOVNL reg32, reg/mem32
CMOVNL reg64, reg/mem64

CMOVGE reg16, reg/mem16       0F 4D /r    Move if greater or equal (SF = OF).
CMOVGE reg32, reg/mem32
CMOVGE reg64, reg/mem64

CMOVLE reg16, reg/mem16       0F 4E /r    Move if less or equal (ZF = 1 or SF <> OF).
CMOVLE reg32, reg/mem32
CMOVLE reg64, reg/mem64

CMOVNG reg16, reg/mem16       0F 4E /r    Move if not greater (ZF = 1 or SF <> OF).
CMOVNG reg32, reg/mem32
CMOVNG reg64, reg/mem64

CMOVNLE reg16, reg/mem16      0F 4F /r    Move if not less or equal (ZF = 0 and SF = OF).
CMOVNLE reg32, reg/mem32
CMOVNLE reg64, reg/mem64

CMOVG reg16, reg/mem16        0F 4F /r    Move if greater (ZF = 0 and SF = OF).
CMOVG reg32, reg/mem32
CMOVG reg64, reg/mem64
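
The conditions in the table above follow a regular pattern in the low four bits of the second opcode byte (0F 40 through 0F 4F): each even value names a condition and the next odd value names its negation. The Python sketch below evaluates a condition from that 4-bit code. The function name condition_met and its argument layout are hypothetical; the flag tests themselves are taken from the table.

```python
def condition_met(cc: int, cf: int, zf: int, sf: int, of: int, pf: int) -> bool:
    """Evaluate the CMOVcc condition for the low nibble cc of opcode 0F 4x."""
    base_conditions = (
        of == 1,               # 0h: O
        cf == 1,               # 2h: B, C, NAE
        zf == 1,               # 4h: Z, E
        cf == 1 or zf == 1,    # 6h: BE, NA
        sf == 1,               # 8h: S
        pf == 1,               # Ah: P, PE
        sf != of,              # Ch: L, NGE
        zf == 1 or sf != of,   # Eh: LE, NG
    )
    result = base_conditions[(cc & 0xF) >> 1]
    # An odd code selects the negated form (NO, NB, NZ, NBE, NS, NP, NL, NLE).
    return (not result) if cc & 1 else result
```

For example, code Ch (CMOVL) is satisfied when SF and OF differ, and code Dh (CMOVNL) is satisfied when they are equal.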


Related Instructions 

MOV 


rFLAGS Affected 

None 

Exceptions 


Exception               Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD      X         X            X      CMOVcc instruction is not supported, as indicated by CPUID
                                                       Fn0000_0001_EDX[CMOV] or Fn8000_0001_EDX[CMOV] = 0.
Stack, #SS               X         X            X      A memory address exceeded the stack segment limit or was
                                                       non-canonical.
General protection, #GP  X         X            X      A memory address exceeded a data segment limit or was
                                                       non-canonical.
                                                X      A null data segment was used to reference memory.
Page fault, #PF                    X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC               X            X      An unaligned memory reference was performed while
                                                       alignment checking was enabled.
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CMP Compare 

Compares the contents of a register or memory location (first operand) with an immediate value or the 
contents of a register or memory location (second operand), and sets or clears the status flags in the 
rFLAGS register to reflect the results. To perform the comparison, the instruction subtracts the second 
operand from the first operand and sets the status flags in the same manner as the SUB instruction, but 
does not alter the first operand. If the second operand is an immediate value, the instruction sign- 
extends the value to the length of the first operand. 
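
The sign-extension of an immediate operand can be sketched in Python as follows. The helper name sign_extend is hypothetical; it treats register values as unsigned integers of a given width.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide value to to_bits, returned as unsigned."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):   # sign bit of the narrow value is set
        value -= 1 << from_bits          # reinterpret as a negative number
    return value & ((1 << to_bits) - 1)  # wrap to the wider unsigned range
```

In this model, the imm8 value FFh compared against a 32-bit operand is treated as FFFF_FFFFh, and the imm32 value 8000_0000h compared against a 64-bit operand is treated as FFFF_FFFF_8000_0000h.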

Use the CMP instruction to set the condition codes for a subsequent conditional jump (Jcc),
conditional move (CMOVcc), or conditional SETcc instruction. Appendix F, “Instruction Effects on 
RFLAGS” shows how instructions affect the rFLAGS status flags. 


Mnemonic                Opcode      Description

CMP AL, imm8            3C ib       Compare an 8-bit immediate value with the contents of
                                    the AL register.
CMP AX, imm16           3D iw       Compare a 16-bit immediate value with the contents of
                                    the AX register.
CMP EAX, imm32          3D id       Compare a 32-bit immediate value with the contents of
                                    the EAX register.
CMP RAX, imm32          3D id       Compare a 32-bit immediate value with the contents of
                                    the RAX register.
CMP reg/mem8, imm8      80 /7 ib    Compare an 8-bit immediate value with the contents of
                                    an 8-bit register or memory operand.
CMP reg/mem16, imm16    81 /7 iw    Compare a 16-bit immediate value with the contents of a
                                    16-bit register or memory operand.
CMP reg/mem32, imm32    81 /7 id    Compare a 32-bit immediate value with the contents of a
                                    32-bit register or memory operand.
CMP reg/mem64, imm32    81 /7 id    Compare a 32-bit signed immediate value with the
                                    contents of a 64-bit register or memory operand.
CMP reg/mem16, imm8     83 /7 ib    Compare an 8-bit signed immediate value with the
                                    contents of a 16-bit register or memory operand.
CMP reg/mem32, imm8     83 /7 ib    Compare an 8-bit signed immediate value with the
                                    contents of a 32-bit register or memory operand.
CMP reg/mem64, imm8     83 /7 ib    Compare an 8-bit signed immediate value with the
                                    contents of a 64-bit register or memory operand.
CMP reg/mem8, reg8      38 /r       Compare the contents of an 8-bit register or memory
                                    operand with the contents of an 8-bit register.
CMP reg/mem16, reg16    39 /r       Compare the contents of a 16-bit register or memory
                                    operand with the contents of a 16-bit register.
CMP reg/mem32, reg32    39 /r       Compare the contents of a 32-bit register or memory
                                    operand with the contents of a 32-bit register.
CMP reg/mem64, reg64    39 /r       Compare the contents of a 64-bit register or memory
                                    operand with the contents of a 64-bit register.
CMP reg8, reg/mem8      3A /r       Compare the contents of an 8-bit register with the
                                    contents of an 8-bit register or memory operand.
CMP reg16, reg/mem16    3B /r       Compare the contents of a 16-bit register with the
                                    contents of a 16-bit register or memory operand.
CMP reg32, reg/mem32    3B /r       Compare the contents of a 32-bit register with the
                                    contents of a 32-bit register or memory operand.
CMP reg64, reg/mem64    3B /r       Compare the contents of a 64-bit register with the
                                    contents of a 64-bit register or memory operand.


When interpreting operands as unsigned, flag settings are as follows:

Operands          CF        ZF

dest > source     0         0
dest = source     0         1
dest < source     1         0

When interpreting operands as signed, flag settings are as follows:

Operands          OF        ZF

dest > source     SF        0
dest = source     0         1
dest < source     NOT SF    0
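
The two tables above can be checked against a small Python model of the CMP subtraction. The function name cmp_flags and the dictionary it returns are hypothetical, and only CF, ZF, SF, and OF are modeled (AF and PF are omitted). Operands are passed as unsigned bit patterns of the given width.

```python
def cmp_flags(dest: int, src: int, bits: int) -> dict:
    """Compute CF, ZF, SF, and OF for CMP dest, src at the given operand width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    dest &= mask
    src &= mask
    result = (dest - src) & mask  # the difference is computed but not stored
    return {
        "CF": int(dest < src),                 # unsigned borrow
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        # Signed overflow: operand signs differ and the result sign
        # differs from the sign of the first operand.
        "OF": int(bool((dest ^ src) & (dest ^ result) & sign)),
    }
```

For an 8-bit comparison of 80h with 01h, the model gives CF = 0 (128 is above 1 when unsigned) and SF = 0 with OF = 1, so OF = NOT SF and the signed row "dest < source" applies (-128 is less than 1).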


Related Instructions 

SUB, CMPSx, SCASx 
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rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception               Real  Virtual 8086  Protected  Cause of Exception

Stack, #SS               X         X            X      A memory address exceeded the stack segment limit or was
                                                       non-canonical.
General protection, #GP  X         X            X      A memory address exceeded a data segment limit or was
                                                       non-canonical.
                                                X      A null data segment was used to reference memory.
Page fault, #PF                    X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC               X            X      An unaligned memory reference was performed while
                                                       alignment checking was enabled.
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CMPS Compare Strings 

CMPSB 

CMPSW 

CMPSD 

CMPSQ 

Compares the bytes, words, doublewords, or quadwords pointed to by the rSI and rDI registers, sets or 
clears the status flags of the rFLAGS register to reflect the results, and then increments or decrements 
the rSI and rDI registers according to the state of the DF flag in the rFLAGS register. To perform the 
comparison, the instruction subtracts the second operand from the first operand and sets the status 
flags in the same manner as the SUB instruction, but does not alter the first operand. The two operands 
must be the same size. 

If the DF flag is 0, the instruction increments rSI and rDI; otherwise, it decrements the pointers. It 
increments or decrements the pointers by 1, 2, 4, or 8, depending on the size of the operands.

The forms of the CMPSx instruction with explicit operands address the first operand at seg:[rSI]. The
value of seg defaults to the DS segment, but may be overridden by a segment prefix. These instructions
always address the second operand at ES:[rDI]; ES may not be overridden. The explicit operands serve
only to specify the type (size) of the values being compared and the segment used by the first operand.

The no-operands forms of the instruction use the DS:[rSI] and ES:[rDI] registers to point to the values
to be compared. The mnemonic determines the size of the operands.

Do not confuse this CMPSD instruction with the same-mnemonic CMPSD (compare scalar
double-precision floating-point) instruction in the 128-bit media instruction set. Assemblers can distinguish
the instructions by the number and type of operands.

For block comparisons, the CMPS instruction supports the REPE or REPZ prefixes (they are 
synonyms) and the REPNE or REPNZ prefixes (they are synonyms). For details about the REP 
prefixes, see “Repeat Prefixes” on page 12. If a conditional jump instruction like JL follows a CMPSx 
instruction, the jump occurs if the value of the seg:[rSI] operand is less than the ES:[rDI] operand. This 
action allows lexicographical comparisons of string or array elements. A CMPSx instruction can also 
operate inside a loop controlled by the LOOPcc instruction. 
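
The pointer stepping and the REPE form described above can be sketched in Python as follows. This is a simplified model: it uses one flat byte buffer in place of the DS and ES segments, compares elements as unsigned little-endian integers, and reports only ZF and CF. The function names cmps_step and repe_cmps are hypothetical.

```python
def cmps_step(mem: bytes, rsi: int, rdi: int, size: int, df: int):
    """One CMPSx iteration: compare mem[rsi] with mem[rdi], then step both pointers."""
    first = int.from_bytes(mem[rsi:rsi + size], "little")
    second = int.from_bytes(mem[rdi:rdi + size], "little")
    step = -size if df else size  # DF = 0 increments, DF = 1 decrements
    return rsi + step, rdi + step, int(first == second), int(first < second)

def repe_cmps(mem: bytes, rsi: int, rdi: int, rcx: int, size: int = 1, df: int = 0):
    """Repeat cmps_step while elements are equal and rcx is nonzero."""
    zf = cf = None  # None means no iteration ran, so the flags were not written
    while rcx:
        rsi, rdi, zf, cf = cmps_step(mem, rsi, rdi, size, df)
        rcx -= 1
        if not zf:  # REPE stops at the first mismatch
            break
    return rsi, rdi, rcx, zf, cf
```

After the loop, ZF = 0 indicates a mismatch, and in this unsigned model CF = 1 indicates that the first string sorts before the second at the mismatching element. The pointers have already stepped past that element.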


Mnemonic              Opcode    Description

CMPS mem8, mem8       A6        Compare the byte at DS:rSI with the byte at ES:rDI and
                                then increment or decrement rSI and rDI.
CMPS mem16, mem16     A7        Compare the word at DS:rSI with the word at ES:rDI and
                                then increment or decrement rSI and rDI.
CMPS mem32, mem32     A7        Compare the doubleword at DS:rSI with the doubleword
                                at ES:rDI and then increment or decrement rSI and rDI.
CMPS mem64, mem64     A7        Compare the quadword at DS:rSI with the quadword at
                                ES:rDI and then increment or decrement rSI and rDI.
CMPSB                 A6        Compare the byte at DS:rSI with the byte at ES:rDI and
                                then increment or decrement rSI and rDI.
CMPSW                 A7        Compare the word at DS:rSI with the word at ES:rDI and
                                then increment or decrement rSI and rDI.
CMPSD                 A7        Compare the doubleword at DS:rSI with the doubleword
                                at ES:rDI and then increment or decrement rSI and rDI.
CMPSQ                 A7        Compare the quadword at DS:rSI with the quadword at
                                ES:rDI and then increment or decrement rSI and rDI.


Related Instructions 

CMP, SCASx 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception               Real  Virtual 8086  Protected  Cause of Exception

Stack, #SS               X         X            X      A memory address exceeded the stack segment limit or was
                                                       non-canonical.
General protection, #GP  X         X            X      A memory address exceeded a data segment limit or was
                                                       non-canonical.
                                                X      A null data segment was used to reference memory.
Page fault, #PF                    X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC               X            X      An unaligned memory reference was performed while
                                                       alignment checking was enabled.
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CMPXCHG Compare and Exchange 

Compares the value in the AL, AX, EAX, or RAX register with the value in a register or a memory 
location (first operand). If the two values are equal, the instruction copies the value in the second 
operand to the first operand and sets the ZF flag in the rFLAGS register to 1. Otherwise, it copies the 
value in the first operand to the AL, AX, EAX, or RAX register and clears the ZF flag to 0. 

The OF, SF, AF, PF, and CF flags are set to reflect the results of the compare. 

When the first operand is a memory operand, CMPXCHG always does a read-modify-write on the 
memory operand. If the compared operands were unequal, CMPXCHG writes the same value to the 
memory operand that was read. 
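
The compare-and-exchange rule can be summarized in a few lines of Python. This sketch models only the data movement and ZF; it does not model the other status flags, the LOCK prefix, or atomicity. The function name cmpxchg and its return tuple are hypothetical.

```python
def cmpxchg(first_operand: int, accumulator: int, second_operand: int):
    """Return (new first operand, new accumulator, ZF) after CMPXCHG."""
    if accumulator == first_operand:
        # Equal: the second operand is copied to the first operand.
        return second_operand, accumulator, 1
    # Not equal: the first operand is copied to the accumulator. For a memory
    # operand, the value that was read is written back unchanged.
    return first_operand, first_operand, 0
```

A typical use is a retry loop: software loads the expected old value into the accumulator, attempts the exchange, and repeats with the refreshed accumulator value while ZF is 0.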

The forms of the CMPXCHG instruction that write to memory support the LOCK prefix. For details 
about the LOCK prefix, see “Lock Prefix” on page 11. 


Mnemonic                    Opcode      Description

CMPXCHG reg/mem8, reg8      0F B0 /r    Compare AL register with an 8-bit register or memory
                                        location. If equal, copy the second operand to the first
                                        operand. Otherwise, copy the first operand to AL.
CMPXCHG reg/mem16, reg16    0F B1 /r    Compare AX register with a 16-bit register or memory
                                        location. If equal, copy the second operand to the first
                                        operand. Otherwise, copy the first operand to AX.
CMPXCHG reg/mem32, reg32    0F B1 /r    Compare EAX register with a 32-bit register or memory
                                        location. If equal, copy the second operand to the first
                                        operand. Otherwise, copy the first operand to EAX.
CMPXCHG reg/mem64, reg64    0F B1 /r    Compare RAX register with a 64-bit register or memory
                                        location. If equal, copy the second operand to the first
                                        operand. Otherwise, copy the first operand to RAX.


Related Instructions 

CMPXCHG8B, CMPXCHG16B
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rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception               Real  Virtual 8086  Protected  Cause of Exception

Stack, #SS               X         X            X      A memory address exceeded the stack segment limit or was
                                                       non-canonical.
General protection, #GP  X         X            X      A memory address exceeded a data segment limit or was
                                                       non-canonical.
                                                X      The destination operand was in a non-writable segment.
                                                X      A null data segment was used to reference memory.
Page fault, #PF                    X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC               X            X      An unaligned memory reference was performed while
                                                       alignment checking was enabled.
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CMPXCHG8B Compare and Exchange Eight Bytes 

CMPXCHG16B Compare and Exchange Sixteen Bytes 

Compares the value in the rDX:rAX registers with a 64-bit or 128-bit value in the specified memory
location. If the values are equal, the instruction copies the value in the rCX:rBX registers to the
memory location and sets the zero flag (ZF) of the rFLAGS register to 1. Otherwise, it copies the value
in memory to the rDX:rAX registers and clears ZF to 0.

If the effective operand size is 16-bit or 32-bit, the CMPXCHG8B instruction is used. This instruction
uses the EDX:EAX and ECX:EBX register operands and a 64-bit memory operand. If the effective
operand size is 64-bit, the CMPXCHG16B instruction is used; this instruction uses the RDX:RAX and
RCX:RBX register operands and a 128-bit memory operand.

The CMPXCHG8B and CMPXCHG16B instructions always do a read-modify-write on the memory
operand. If the compared operands were unequal, the instructions write the same value to the memory
operand that was read.

The CMPXCHG8B and CMPXCHG16B instructions support the LOCK prefix. For details about the
LOCK prefix, see “Lock Prefix” on page 11.

Support for the CMPXCHG8B and CMPXCHG16B instructions is implementation dependent.
Support for the CMPXCHG8B instruction is indicated by CPUID
Fn0000_0001_EDX[CMPXCHG8B] or Fn8000_0001_EDX[CMPXCHG8B] = 1. Support for the
CMPXCHG16B instruction is indicated by CPUID Fn0000_0001_ECX[CMPXCHG16B] = 1.

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 

The memory operand used by CMPXCHG16B must be 16-byte aligned or else a general-protection
exception is generated.
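
The 128-bit form, including the alignment requirement, can be sketched in Python as follows. The sketch is a non-atomic model over a flat bytearray; it stores the 128-bit value in little-endian order with rAX as the low half, and it signals the alignment fault with a Python ValueError. The function name cmpxchg16b and its return tuple are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def cmpxchg16b(mem: bytearray, addr: int, rdx: int, rax: int, rcx: int, rbx: int):
    """Return (new RDX, new RAX, ZF); mem is updated in place on a match."""
    if addr % 16:
        raise ValueError("#GP: memory operand is not 16-byte aligned")
    old = int.from_bytes(mem[addr:addr + 16], "little")
    if old == ((rdx << 64) | rax):
        # Equal: RCX:RBX is stored to memory and ZF is set.
        mem[addr:addr + 16] = ((rcx << 64) | rbx).to_bytes(16, "little")
        return rdx, rax, 1
    # Not equal: the memory value is loaded into RDX:RAX and ZF is cleared.
    return old >> 64, old & MASK64, 0
```

The 64-bit CMPXCHG8B form follows the same pattern with EDX:EAX, ECX:EBX, and an 8-byte memory operand, without the 16-byte alignment check.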

Mnemonic              Opcode          Description

CMPXCHG8B mem64       0F C7 /1 m64    Compare EDX:EAX register to 64-bit memory location.
                                      If equal, set the zero flag (ZF) to 1 and copy the
                                      ECX:EBX register to the memory location. Otherwise,
                                      copy the memory location to EDX:EAX and clear the
                                      zero flag.
CMPXCHG16B mem128     0F C7 /1 m128   Compare RDX:RAX register to 128-bit memory location.
                                      If equal, set the zero flag (ZF) to 1 and copy the
                                      RCX:RBX register to the memory location. Otherwise,
                                      copy the memory location to RDX:RAX and clear the
                                      zero flag.

Related Instructions 

CMPXCHG 
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rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                                              M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception               Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD      X         X            X      CMPXCHG8B instruction is not supported, as indicated by
                                                       CPUID Fn0000_0001_EDX[CMPXCHG8B] or
                                                       Fn8000_0001_EDX[CMPXCHG8B] = 0.
                                                X      CMPXCHG16B instruction is not supported, as indicated by
                                                       CPUID Fn0000_0001_ECX[CMPXCHG16B] = 0.
                         X         X            X      The operand was a register.
Stack, #SS               X         X            X      A memory address exceeded the stack segment limit or was
                                                       non-canonical.
General protection, #GP  X         X            X      A memory address exceeded a data segment limit or was
                                                       non-canonical.
                                                X      The destination operand was in a non-writable segment.
                                                X      A null data segment was used to reference memory.
                                                X      The memory operand for CMPXCHG16B was not aligned on a
                                                       16-byte boundary.
Page fault, #PF                    X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC               X            X      An unaligned memory reference was performed while
                                                       alignment checking was enabled.




24594 — Rev. 3.28—September 2019 


CPUID Processor Identification 

Provides information about the processor and its capabilities through a number of different functions. 
Software should load the number of the CPUID function to execute into the EAX register before 
executing the CPUID instruction. The processor returns information in the EAX, EBX, ECX, and 
EDX registers; the contents and format of these registers depend on the function.

The architecture supports CPUID information about standard functions and extended functions. The 
standard functions have numbers in the 0000_xxxxh series (for example, standard function 1). To
determine the largest standard function number that a processor supports, execute CPUID function 0. 

The extended functions have numbers in the 8000_xxxxh series (for example, extended 
function 8000_0001h). To determine the largest extended function number that a processor supports, 
execute CPUID extended function 8000_0000h. If the value returned in EAX is greater than 
8000_0000h, the processor supports extended functions. 
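
The range checks described above can be sketched in Python as follows. The function name cpuid_function_supported is hypothetical; max_standard and max_extended stand for the EAX values returned by function 0 and function 8000_0000h. Treating every function number outside the 0000_xxxxh and 8000_xxxxh series as unsupported is a simplifying assumption of this sketch.

```python
def cpuid_function_supported(function: int, max_standard: int, max_extended: int) -> bool:
    """Decide whether a CPUID function number lies in a supported range."""
    if 0 <= function <= 0xFFFF:
        # Standard series: compare against the value reported by function 0.
        return function <= max_standard
    if 0x8000_0000 <= function <= 0x8000_FFFF:
        # Extended series: present only if function 8000_0000h reported a
        # value greater than 8000_0000h.
        return max_extended > 0x8000_0000 and function <= max_extended
    return False  # assumption: other ranges are not considered here
```

Software would normally perform this check before relying on the register contents returned for a given function number.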

Software operating at any privilege level can execute the CPUID instruction to collect this 
information. In 64-bit mode, this instruction works the same as in legacy mode except that it zero- 
extends 32-bit register results to 64 bits. 

CPUID is a serializing instruction. 

Mnemonic             Opcode         Description

CPUID                0F A2          Returns information about the processor and its
                                    capabilities. EAX specifies the function number, and the
                                    data is returned in EAX, EBX, ECX, EDX.

Testing for the CPUID Instruction 

To avoid an invalid-opcode exception (#UD) on those processor implementations that do not support 
the CPUID instruction, software must first test to determine if the CPUID instruction is supported. 
Support for the CPUID instruction is indicated by the ability to write the ID bit in the rFLAGS register. 
Normally, 32-bit software uses the PUSHFD and POPFD instructions in an attempt to write 
rFLAGS.ID. After reading the updated rFLAGS.ID bit, a comparison determines if the operation 
changed its value. If the value changed, the processor executing the code supports the CPUID 
instruction. If the value did not change, rFLAGS.ID is not writable, and the processor does not support 
the CPUID instruction. 

The following code sample shows how to test for the presence of the CPUID instruction using 32-bit 
code. 


    pushfd                      ; save EFLAGS
    pop     eax                 ; store EFLAGS in EAX
    mov     ebx, eax            ; save in EBX for later testing
    xor     eax, 00200000h      ; toggle bit 21
    push    eax                 ; push to stack
    popfd                       ; save changed EAX to EFLAGS
    pushfd                      ; push EFLAGS to TOS
    pop     eax                 ; store EFLAGS in EAX
    cmp     eax, ebx            ; see if bit 21 has changed
    jz      NO_CPUID            ; if no change, no CPUID


Standard Function 0 and Extended Function 8000_0000h 

CPUID standard function 0 loads the EAX register with the largest CPUID standard function number 
supported by the processor implementation; similarly, CPUID extended function 8000_0000h loads 
the EAX register with the largest extended function number supported. 

Standard function 0 and extended function 8000_0000h both load a 12-character string into the EBX, 
EDX, and ECX registers identifying the processor vendor. For AMD processors, the string is 
AuthenticAMD. This string informs software that it should follow the AMD CPUID definition for
subsequent CPUID function calls. If the function returns another vendor’s string, software must use 
that vendor’s CPUID definition when interpreting the results of subsequent CPUID function calls. 
Table 3-2 shows the contents of the EBX, EDX, and ECX registers after executing function 0 on an 
AMD processor. 


Table 3-2. Processor Vendor Return Values 


Register    Return Value    ASCII Characters

EBX         6874_7541h      "h t u A"
EDX         6974_6E65h      "i t n e"
ECX         444D_4163h      "D M A c"


For a description of all feature flags related to instruction subset support, see Appendix D, “Instruction 
Subsets and CPUID Feature Flags,” on page 537. For a description of all defined feature numbers and 
return values, see Appendix E, “Obtaining Processor Information Via the CPUID Instruction,” on 
page 607. 
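
The register values in Table 3-2 can be turned back into the vendor string with a short Python sketch. Each register holds four ASCII characters with the first character in the least-significant byte, so the bytes are read in little-endian order from EBX, then EDX, then ECX. The function name vendor_string is hypothetical.

```python
import struct

def vendor_string(ebx: int, edx: int, ecx: int) -> str:
    """Rebuild the 12-character vendor string from CPUID function 0 results."""
    # "<III" packs three 32-bit values least-significant byte first,
    # in the order EBX, EDX, ECX.
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

print(vendor_string(0x68747541, 0x69746E65, 0x444D4163))  # prints AuthenticAMD
```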

Related Instructions 

None 

rFLAGS Affected 

None 

Exceptions 

None 
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CRC32 CRC32 Cyclical Redundancy Check 

Performs one step of a 32-bit cyclic redundancy check. 

The first source, which is also the destination, is a doubleword value in either a 32-bit or 64-bit GPR 
depending on the presence of a REX prefix and the value of the REX.W bit. The second source is a 
GPR or memory location of width 8, 16, or 32 bits. A vector of width 40, 48, or 64 bits is derived from 
the two operands as follows: 

1. The low-order 32 bits of the first operand is bit-wise inverted and shifted left by the width of the 
second operand. 

2. The second operand is bit-wise inverted and shifted left by 32 bits.

3. The results of steps 1 and 2 are XORed.

This vector is interpreted as a polynomial of degree 40, 48, or 64 over the field of two elements (i.e., bit
i is interpreted as the coefficient of X^i). This polynomial is divided by the polynomial of degree 32
that is similarly represented by the vector 11EDC6F41h. (The division admits an efficient iterative
implementation based on the xor operation.) The remainder is encoded as a 32-bit vector, which is 
bit-wise inverted and written to the destination. In the case of a 64-bit destination, the upper 32 bits are 
cleared. 

In an application of the CRC algorithm, a data block is partitioned into byte, word, or doubleword 
segments and CRC32 is executed iteratively, once for each segment. 
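
The three steps and the polynomial division described above can be sketched in Python as follows. This is an illustrative model, not the processor's implementation. It rests on two assumptions that are not stated on this page: "bit-wise inverted" is read as reversing the bit order of the value (bit reflection), and the block-level helper `crc32c` uses the common CRC-32C convention of an all-ones initial value and a final complement. The names `reflect`, `mod2`, `crc32_step`, and `crc32c` are hypothetical.

```python
POLY = 0x11EDC6F41  # degree-32 divisor polynomial from the text


def reflect(value: int, width: int) -> int:
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for i in range(width):
        if (value >> i) & 1:
            result |= 1 << (width - 1 - i)
    return result


def mod2(dividend: int, divisor: int = POLY) -> int:
    """Remainder of polynomial division over the field of two elements."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        # Cancel the current leading term with a shifted copy of the divisor.
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


def crc32_step(dest: int, src: int, width: int) -> int:
    """Model one CRC32 instruction with a src operand of `width` bits."""
    src &= (1 << width) - 1
    step1 = reflect(dest & 0xFFFFFFFF, 32) << width  # step 1
    step2 = reflect(src, width) << 32                # step 2
    vector = step1 ^ step2                           # step 3
    return reflect(mod2(vector), 32)


def crc32c(data: bytes) -> int:
    """Apply crc32_step once per byte (assumed CRC-32C init/final XOR)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32_step(crc, byte, 8)
    return crc ^ 0xFFFFFFFF
```

Under these assumptions, `crc32c(b"123456789")` should reproduce the widely published CRC-32C check value E3069283h, and processing four bytes one at a time gives the same result as a single 32-bit step on the same bytes read as a little-endian doubleword.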

CRC32 is an SSE4.2 instruction. Support for SSE4.2 instructions is indicated by CPUID 
Fn0000_0001_ECX[SSE42] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 

Instruction Encoding 

Mnemonic                  Encoding                Notes
CRC32 reg32, reg/mem8     F2 0F 38 F0 /r          Perform CRC32 operation on 8-bit values
CRC32 reg32, reg/mem8     F2 REX 0F 38 F0 /r      Encoding using REX prefix allows access to GPR8-15
CRC32 reg32, reg/mem16    F2 0F 38 F1 /r          Effective operand size determines size of second operand.
CRC32 reg32, reg/mem32    F2 0F 38 F1 /r          Effective operand size determines size of second operand.
CRC32 reg64, reg/mem8     F2 REX.W 0F 38 F0 /r    REX.W = 1.
CRC32 reg64, reg/mem64    F2 REX.W 0F 38 F1 /r    REX.W = 1.

rFLAGS Affected 

None 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          Lock prefix used
                         X     X             X          SSE42 instructions are not supported as indicated by CPUID
                                                        Fn0000_0001_ECX[SSE42] = 0.
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was
                                                        non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

DAA Decimal Adjust after Addition 

Adjusts the value in the AL register into a packed BCD result and sets the CF and AF flags in the 
rFLAGS register to indicate a decimal carry out of either nibble of AL. 

Use this instruction to adjust the result of a byte ADD instruction that performed the binary addition of 
one 2-digit packed BCD value to another. 

The instruction performs the adjustment by adding 06h to AL if the lower nibble is greater than 9 or if 
AF = 1. Then 60h is added to AL if the original AL was greater than 99h or if CF = 1. 

If the lower nibble of AL was adjusted, the AF flag is set to 1. Otherwise AF is not modified. If the 
upper nibble of AL was adjusted, the CF flag is set to 1. Otherwise, CF is not modified. SF, ZF, and PF 
are set according to the final value of AL. 

Using this instruction in 64-bit mode generates an invalid-opcode (#UD) exception. 
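
The adjustment rule above can be modeled in a few lines of Python. This is a sketch of the rule as worded in this section, not a cycle-accurate model: it tracks only AL, CF, and AF, and leaves a flag unchanged when the text says it is "not modified". The helper names `add8`, `daa`, and `bcd_add` are hypothetical.

```python
def add8(a: int, b: int):
    """Byte ADD: return (result, CF, AF)."""
    total = a + b
    carry = int(total > 0xFF)
    aux_carry = int((a & 0x0F) + (b & 0x0F) > 0x0F)
    return total & 0xFF, carry, aux_carry


def daa(al: int, cf: int, af: int):
    """Apply the DAA rule described above; return (AL, CF, AF)."""
    old_al, old_cf = al, cf
    if (al & 0x0F) > 9 or af:
        al = (al + 0x06) & 0xFF  # adjust the lower nibble
        af = 1
    if old_al > 0x99 or old_cf:
        al = (al + 0x60) & 0xFF  # adjust the upper nibble
        cf = 1
    return al, cf, af


def bcd_add(a: int, b: int):
    """ADD followed by DAA on two 2-digit packed BCD bytes."""
    return daa(*add8(a, b))


# 19 + 28 = 47 in packed BCD; 58 + 46 = 104 (04 with a decimal carry).
print([hex(x) for x in bcd_add(0x19, 0x28)])
print([hex(x) for x in bcd_add(0x58, 0x46)])
```

The second call shows the decimal carry: AL holds 04h and CF is 1, which a multi-byte BCD addition would propagate with ADC.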


Mnemonic 


Opcode Description 


DAA 


Decimal adjust AL. 
(Invalid in 64-bit mode.) 


rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                U                       M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                          X          This instruction was executed in 64-bit mode.

DAS Decimal Adjust after Subtraction 

Adjusts the value in the AL register into a packed BCD result and sets the CF and AF flags in the 
rFLAGS register to indicate a decimal borrow. 

Use this instruction to adjust the result of a byte SUB instruction that performed a binary subtraction of 
one 2-digit, packed BCD value from another. 

This instruction performs the adjustment by subtracting 06h from AL if the lower nibble is greater than 
9 or if AF = 1. Then 60h is subtracted from AL if the original AL was greater than 99h or if CF = 1. 

If the adjustment changes the lower nibble of AL, the AF flag is set to 1; otherwise AF is not modified. 
If the adjustment results in a borrow for either nibble of AL, the CF flag is set to 1; otherwise CF is not 
modified. The SF, ZF, and PF flags are set according to the final value of AL. 

Using this instruction in 64-bit mode generates an invalid-opcode (#UD) exception. 
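
The subtraction case can be sketched the same way. The Python model below follows the rule as worded in this section and tracks only AL, CF, and AF; the helper names `sub8`, `das`, and `bcd_sub` are hypothetical.

```python
def sub8(a: int, b: int):
    """Byte SUB: return (result, CF as borrow, AF as nibble borrow)."""
    borrow = int(a < b)
    aux_borrow = int((a & 0x0F) < (b & 0x0F))
    return (a - b) & 0xFF, borrow, aux_borrow


def das(al: int, cf: int, af: int):
    """Apply the DAS rule described above; return (AL, CF, AF)."""
    old_al, old_cf = al, cf
    if (al & 0x0F) > 9 or af:
        al = (al - 0x06) & 0xFF  # adjust the lower nibble
        af = 1
    if old_al > 0x99 or old_cf:
        al = (al - 0x60) & 0xFF  # adjust the upper nibble
        cf = 1
    return al, cf, af


def bcd_sub(a: int, b: int):
    """SUB followed by DAS on two 2-digit packed BCD bytes."""
    return das(*sub8(a, b))


# 47 - 28 = 19 in packed BCD; 12 - 34 borrows and leaves 78 (100 - 22).
print([hex(x) for x in bcd_sub(0x47, 0x28)])
print([hex(x) for x in bcd_sub(0x12, 0x34)])
```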


Mnemonic    Opcode    Description
DAS         2F        Decimal adjusts AL after subtraction.
                      (Invalid in 64-bit mode.)


Related Instructions 

DAA 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                U                       M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                          X          This instruction was executed in 64-bit mode.

DEC Decrement by 1 

Subtracts 1 from the specified register or memory location. The CF flag is not affected. 

The one-byte forms of this instruction (opcodes 48 through 4F) are used as REX prefixes in 64-bit 
mode. See “REX Prefix” on page 14. 

The forms of the DEC instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 

To perform a decrement operation that updates the CF flag, use a SUB instruction with an immediate 
operand of 1. 
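
The difference between DEC and SUB with an immediate of 1 can be illustrated with a small Python model of an 8-bit operand and the CF flag only. This is a sketch; the function names `dec8` and `sub8_imm1` are hypothetical and the other flags are not modeled.

```python
def dec8(value: int, cf: int):
    """DEC on an 8-bit operand: CF passes through unchanged."""
    return (value - 1) & 0xFF, cf


def sub8_imm1(value: int, cf: int):
    """SUB operand, 1 on an 8-bit operand: CF reports the borrow."""
    return (value - 1) & 0xFF, int(value < 1)


# Decrementing 00h wraps to FFh in both cases, but only SUB sets CF.
print(dec8(0x00, 0))
print(sub8_imm1(0x00, 0))
```

Because DEC preserves CF, it can sit inside a multi-precision loop that uses ADC or SBB without disturbing the carry chain.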


Mnemonic         Opcode    Description
DEC reg/mem8     FE /1     Decrement the contents of an 8-bit register or memory
                           location by 1.
DEC reg/mem16    FF /1     Decrement the contents of a 16-bit register or memory
                           location by 1.
DEC reg/mem32    FF /1     Decrement the contents of a 32-bit register or memory
                           location by 1.
DEC reg/mem64    FF /1     Decrement the contents of a 64-bit register or memory
                           location by 1.
DEC reg16        48 +rw    Decrement the contents of a 16-bit register by 1.
                           (See “REX Prefix” on page 14.)
DEC reg32        48 +rd    Decrement the contents of a 32-bit register by 1.
                           (See “REX Prefix” on page 14.)


Related Instructions 

INC, SUB 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 

Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was
                                                        non-canonical.
General protection, #GP  X     X             X          A memory address exceeded the data segment limit or was
                                                        non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

DIV Unsigned Divide 

Divides the unsigned value in a register by the unsigned value in the specified register or memory 
location. The register to be divided depends on the size of the divisor. 

When dividing a word, the dividend is in the AX register. The instruction stores the quotient in the AL 
register and the remainder in the AH register. 

When dividing a doubleword, quadword, or double quadword, the most-significant word of the 
dividend is in the rDX register and the least-significant word is in the rAX register. After the division, 
the instruction stores the quotient in the rAX register and the remainder in the rDX register. 

The following table summarizes the action of this instruction: 


Division Size               Dividend   Divisor     Quotient   Remainder   Maximum Quotient
Word/byte                   AX         reg/mem8    AL         AH          255
Doubleword/word             DX:AX      reg/mem16   AX         DX          65,535
Quadword/doubleword         EDX:EAX    reg/mem32   EAX        EDX         2^32 - 1
Double quadword/quadword    RDX:RAX    reg/mem64   RAX        RDX         2^64 - 1

The instruction truncates non-integral results towards 0 and the remainder is always less than the 
divisor. An overflow generates a #DE (divide error) exception, rather than setting the CF flag. 

Division by zero generates a divide-by-zero exception. 
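
The table and the two #DE conditions can be summarized with a short Python model. It is a sketch only: the exception is represented by a hypothetical `DivideError` class, and the hypothetical `div` function takes the upper and lower halves of the dividend (for the word/byte case these are AH and AL) together with the divisor width in bits.

```python
class DivideError(ArithmeticError):
    """Stands in for the #DE exception in this model."""


def div(high: int, low: int, divisor: int, bits: int):
    """Unsigned divide of the double-width value high:low by divisor.

    Returns (quotient, remainder), each of which fits in `bits` bits.
    """
    if divisor == 0:
        raise DivideError("#DE: the divisor operand was 0")
    dividend = (high << bits) | low
    quotient, remainder = divmod(dividend, divisor)
    if quotient > (1 << bits) - 1:
        raise DivideError("#DE: quotient too large for the register")
    return quotient, remainder


# EDX:EAX = 1:0 (2^32) divided by 2 gives EAX = 80000000h, EDX = 0.
print(div(1, 0, 2, 32))

# AX = 1234h divided by 10h overflows AL and raises #DE in this model.
try:
    div(0x12, 0x34, 0x10, 8)
except DivideError as exc:
    print(exc)
```

The overflow case is why code that divides a single-width value usually clears the upper half of the dividend first.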


Mnemonic         Opcode    Description
DIV reg/mem8     F6 /6     Perform unsigned division of AX by the contents of an 8-bit
                           register or memory location and store the quotient in
                           AL and the remainder in AH.
DIV reg/mem16    F7 /6     Perform unsigned division of DX:AX by the contents of a
                           16-bit register or memory location and store the quotient in
                           AX and the remainder in DX.
DIV reg/mem32    F7 /6     Perform unsigned division of EDX:EAX by the contents
                           of a 32-bit register or memory location and store the
                           quotient in EAX and the remainder in EDX.
DIV reg/mem64    F7 /6     Perform unsigned division of RDX:RAX by the contents
                           of a 64-bit register or memory location and store the
                           quotient in RAX and the remainder in RDX.


Related Instructions 

MUL 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                U                       U     U     U     U     U
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Divide by zero, #DE      X     X             X          The divisor operand was 0.
                         X     X             X          The quotient was too large for the designated register.
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was
                                                        non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

ENTER Create Procedure Stack Frame 

Creates a stack frame for a procedure. 

The first operand specifies the size of the stack frame allocated by the instruction. 

The second operand specifies the nesting level (0 to 31—the value is automatically masked to 5 bits). 
For nesting levels of 1 or greater, the processor copies earlier stack frame pointers before adjusting the 
stack pointer. This action provides a called procedure with access points to other nested stack frames. 

The 32-bit enter N, 0 (a nesting level of 0) instruction is equivalent to the following 32-bit 
instruction sequence: 


    push ebp          ; save current EBP
    mov  ebp, esp     ; set stack frame pointer value
    sub  esp, N       ; allocate space for local variables


The ENTER and LEAVE instructions provide support for block structured languages. The LEAVE 
instruction releases the stack frame on returning from a procedure. 

In 64-bit mode, the operand size of ENTER defaults to 64 bits, and there is no prefix available for 
encoding a 32-bit operand size. 


Mnemonic             Opcode      Description
ENTER imm16, 0       C8 iw 00    Create a procedure stack frame.
ENTER imm16, 1       C8 iw 01    Create a nested stack frame for a procedure.
ENTER imm16, imm8    C8 iw ib    Create a nested stack frame for a procedure.


Action 

// See "Pseudocode Definition" on page 57. 

ENTER_START: 

temp_ALLOC_SPACE = word-sized immediate specified in the instruction 
                   (first operand), zero-extended to 64 bits 
temp_LEVEL = byte-sized immediate specified in the instruction 
             (second operand), zero-extended to 64 bits 

temp_LEVEL = temp_LEVEL AND 0x1f      // only keep 5 bits of level count 

PUSH.v old_RBP 

temp_RBP = RSP        // This value of RSP will eventually be loaded 
                      // into RBP. 

IF (temp_LEVEL>0)     // Push "temp_LEVEL" parameters to the stack. 
{ 
    FOR (I=1; I<temp_LEVEL; I++) 
                      // All but one of the parameters are copied 
                      // from higher up on the stack. 
    { 
        temp_DATA = READ_MEM.v [SS:old_RBP-I*V] 
        PUSH.v temp_DATA 
    } 
    PUSH.v temp_RBP   // The last parameter is the offset of the old 
                      // value of RSP on the stack. 
} 

RSP.s = RSP - temp_ALLOC_SPACE        // Leave "temp_ALLOC_SPACE" free bytes on 
                                      // the stack 

WRITE_MEM.v [SS:RSP.s] = temp_unused  // ENTER finishes with a memory write 
                                      // check on the final stack pointer, 
                                      // but no write actually occurs. 

RBP.v = temp_RBP 
EXIT 
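
The pseudocode above can be exercised with a small Python model. This is a sketch under stated assumptions: stack memory is a plain dictionary from address to value, the operand size V is passed as the hypothetical parameter `v` (in bytes), and the final write check is omitted. The function names `enter` and `push_mov_sub` are hypothetical; the second one models the push/mov/sub sequence that the text gives as the equivalent of a nesting level of 0.

```python
def enter(mem: dict, rsp: int, rbp: int, alloc: int, level: int, v: int = 8):
    """Model ENTER; return the new (rSP, rBP)."""
    alloc &= 0xFFFF   # word-sized immediate
    level &= 0x1F     # only keep 5 bits of level count
    old_rbp = rbp

    def push(value: int) -> None:
        nonlocal rsp
        rsp -= v
        mem[rsp] = value

    push(old_rbp)
    temp_rbp = rsp
    if level > 0:
        # Copy level-1 earlier frame pointers, then push the new one.
        for i in range(1, level):
            push(mem[old_rbp - i * v])
        push(temp_rbp)
    rsp -= alloc
    return rsp, temp_rbp


def push_mov_sub(mem: dict, rsp: int, rbp: int, alloc: int, v: int = 8):
    """Model: push rBP / mov rBP, rSP / sub rSP, N."""
    rsp -= v
    mem[rsp] = rbp
    rbp = rsp
    rsp -= alloc
    return rsp, rbp


stack_a, stack_b = {}, {}
print(enter(stack_a, 0x1000, 0x2000, 32, 0))
print(push_mov_sub(stack_b, 0x1000, 0x2000, 32))
print(stack_a == stack_b)
```

With a nesting level of 2 the model copies one frame pointer from the caller's frame before pushing the new frame pointer, which is the display that nested procedures use to reach enclosing frames.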

Related Instructions 

LEAVE 

rFLAGS Affected 

None 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack-segment limit or was
                                                        non-canonical.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

IDIV Signed Divide 

Divides the signed value in a register by the signed value in the specified register or memory location. 
The register to be divided depends on the size of the divisor. 

When dividing a word, the dividend is in the AX register. The instruction stores the quotient in the AL 
register and the remainder in the AH register. 

When dividing a doubleword, quadword, or double quadword, the most-significant word of the 
dividend is in the rDX register and the least-significant word is in the rAX register. After the division, 
the instruction stores the quotient in the rAX register and the remainder in the rDX register. 

The following table summarizes the action of this instruction: 


Division Size               Dividend   Divisor     Quotient   Remainder   Quotient Range
Word/byte                   AX         reg/mem8    AL         AH          -128 to +127
Doubleword/word             DX:AX      reg/mem16   AX         DX          -32,768 to +32,767
Quadword/doubleword         EDX:EAX    reg/mem32   EAX        EDX         -2^31 to 2^31 - 1
Double quadword/quadword    RDX:RAX    reg/mem64   RAX        RDX         -2^63 to 2^63 - 1


The instruction truncates non-integral results towards 0. The sign of the remainder is always the same 
as the sign of the dividend, and the absolute value of the remainder is less than the absolute value of the 
divisor. An overflow generates a #DE (divide error) exception, rather than setting the OF flag. 

To avoid overflow problems, precede this instruction with a CBW, CWD, CDQ, or CQO instruction to 
sign-extend the dividend. 
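
The rounding and sign rules above differ from Python's own `//` and `%` operators, which round toward negative infinity, so a model has to truncate explicitly. The sketch below takes an already sign-extended dividend as a Python integer; the names `DivideError` and `idiv` are hypothetical, and `bits` is the width of the divisor and of the quotient register.

```python
class DivideError(ArithmeticError):
    """Stands in for the #DE exception in this model."""


def idiv(dividend: int, divisor: int, bits: int):
    """Signed divide with truncation toward 0; return (quotient, remainder)."""
    if divisor == 0:
        raise DivideError("#DE: the divisor operand was 0")
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    # The remainder takes the sign of the dividend.
    remainder = dividend - quotient * divisor
    if not -(1 << (bits - 1)) <= quotient <= (1 << (bits - 1)) - 1:
        raise DivideError("#DE: quotient too large for the register")
    return quotient, remainder


print(idiv(-7, 2, 8))   # truncates toward 0
print(-7 // 2, -7 % 2)  # Python's floor division, shown for contrast
```

A case worth noting is the most negative quotient-width value divided by -1 (for example -32,768 / -1 with a 16-bit divisor): the true quotient does not fit the quotient range, so this model raises the divide error.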


Mnemonic          Opcode    Description
IDIV reg/mem8     F6 /7     Perform signed division of AX by the contents of an 8-bit
                            register or memory location and store the quotient in AL
                            and the remainder in AH.
IDIV reg/mem16    F7 /7     Perform signed division of DX:AX by the contents of a
                            16-bit register or memory location and store the quotient
                            in AX and the remainder in DX.
IDIV reg/mem32    F7 /7     Perform signed division of EDX:EAX by the contents of
                            a 32-bit register or memory location and store the
                            quotient in EAX and the remainder in EDX.
IDIV reg/mem64    F7 /7     Perform signed division of RDX:RAX by the contents of
                            a 64-bit register or memory location and store the
                            quotient in RAX and the remainder in RDX.

Related Instructions 

IMUL 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                U                       U     U     U     U     U
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Divide by zero, #DE      X     X             X          The divisor operand was 0.
                         X     X             X          The quotient was too large for the designated register.
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was
                                                        non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

IMUL Signed Multiply 

Multiplies two signed operands. The number of operands determines the form of the instruction. 

If a single operand is specified, the instruction multiplies the value in the specified general-purpose 
register or memory location by the value in the AL, AX, EAX, or RAX register (depending on the 
operand size) and stores the product in AX, DX:AX, EDX:EAX, or RDX:RAX, respectively. 

If two operands are specified, the instruction multiplies the value in a general-purpose register (first 
operand) by an immediate value or the value in a general-purpose register or memory location (second 
operand) and stores the product in the first operand location. 

If three operands are specified, the instruction multiplies the value in a general-purpose register or 
memory location (second operand), by an immediate value (third operand) and stores the product in a 
register (first operand). 

The IMUL instruction sign-extends an immediate operand to the length of the other register/memory 
operand. 

The CF and OF flags are set if, due to integer overflow, the double-width multiplication result cannot 
be represented in the half-width destination register. Otherwise the CF and OF flags are cleared. 
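
The overflow rule for the two-operand and three-operand forms can be sketched in Python, where integers are unbounded and the full product is therefore exact. The model below keeps the low `bits` bits of the product as a signed value and reports a single flag value that stands for both CF and OF; the names `to_signed` and `imul` are hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def imul(a: int, b: int, bits: int):
    """Signed multiply into a `bits`-wide destination.

    Returns (truncated product, flag), where flag is the common value
    of CF and OF: 1 if the exact product does not fit the destination.
    """
    full = a * b
    truncated = to_signed(full, bits)
    return truncated, int(truncated != full)


print(imul(-3, 7, 16))       # fits: flags cleared
print(imul(1000, 1000, 16))  # does not fit: flags set
```

For the one-operand form the same condition applies to the lower half of the double-width result, with the full product still available in the register pair.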


Mnemonic                        Opcode     Description
IMUL reg/mem8                   F6 /5      Multiply the contents of AL by the contents of an 8-bit
                                           memory or register operand and put the signed result in AX.
IMUL reg/mem16                  F7 /5      Multiply the contents of AX by the contents of a 16-bit
                                           memory or register operand and put the signed result in DX:AX.
IMUL reg/mem32                  F7 /5      Multiply the contents of EAX by the contents of a 32-bit
                                           memory or register operand and put the signed result in EDX:EAX.
IMUL reg/mem64                  F7 /5      Multiply the contents of RAX by the contents of a 64-bit
                                           memory or register operand and put the signed result in RDX:RAX.
IMUL reg16, reg/mem16           0F AF /r   Multiply the contents of a 16-bit destination register by
                                           the contents of a 16-bit register or memory operand and
                                           put the signed result in the 16-bit destination register.
IMUL reg32, reg/mem32           0F AF /r   Multiply the contents of a 32-bit destination register by
                                           the contents of a 32-bit register or memory operand and
                                           put the signed result in the 32-bit destination register.
IMUL reg64, reg/mem64           0F AF /r   Multiply the contents of a 64-bit destination register by
                                           the contents of a 64-bit register or memory operand and
                                           put the signed result in the 64-bit destination register.
IMUL reg16, reg/mem16, imm8     6B /r ib   Multiply the contents of a 16-bit register or memory
                                           operand by a sign-extended immediate byte and put the
                                           signed result in the 16-bit destination register.
IMUL reg32, reg/mem32, imm8     6B /r ib   Multiply the contents of a 32-bit register or memory
                                           operand by a sign-extended immediate byte and put the
                                           signed result in the 32-bit destination register.
IMUL reg64, reg/mem64, imm8     6B /r ib   Multiply the contents of a 64-bit register or memory
                                           operand by a sign-extended immediate byte and put the
                                           signed result in the 64-bit destination register.
IMUL reg16, reg/mem16, imm16    69 /r iw   Multiply the contents of a 16-bit register or memory
                                           operand by a sign-extended immediate word and put
                                           the signed result in the 16-bit destination register.
IMUL reg32, reg/mem32, imm32    69 /r id   Multiply the contents of a 32-bit register or memory
                                           operand by a sign-extended immediate double and put
                                           the signed result in the 32-bit destination register.
IMUL reg64, reg/mem64, imm32    69 /r id   Multiply the contents of a 64-bit register or memory
                                           operand by a sign-extended immediate double and put
                                           the signed result in the 64-bit destination register.


Related Instructions 


IDIV 


rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       U     U     U     U     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was
                                                        non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

IN Input from Port 

Transfers a byte, word, or doubleword from an I/O port to the AL, AX, or EAX register. The port 
address is specified either by an 8-bit immediate value (00h to FFh) encoded in the instruction or a 
16-bit value contained in the DX register (0000h to FFFFh). The processor’s I/O address space is distinct 
from system memory addressing. 

For two opcodes (E4h and ECh), the data size of the port is fixed at 8 bits. For the other opcodes (E5h 
and EDh), the effective operand-size determines the port size. If the effective operand size is 64 bits, 
IN reads only 32 bits from the I/O port. 
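
The relationship between opcode, effective operand size, and port width described above is small enough to write down directly. The Python sketch below is only a restatement of that paragraph; the function name `in_port_width` is hypothetical.

```python
def in_port_width(opcode: int, operand_size: int) -> int:
    """Return the number of bits IN reads from the port.

    opcode is E4h, E5h, ECh, or EDh; operand_size is the effective
    operand size in bits (16, 32, or 64).
    """
    if opcode in (0xE4, 0xEC):
        return 8  # fixed 8-bit data size
    if opcode in (0xE5, 0xED):
        # A 64-bit effective operand size still reads only 32 bits.
        return 16 if operand_size == 16 else 32
    raise ValueError("not an IN opcode")


for opcode, size in [(0xE4, 32), (0xED, 16), (0xE5, 64)]:
    print(hex(opcode), size, in_port_width(opcode, size))
```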

If the CPL is higher than IOPL, or the mode is virtual mode, IN checks the I/O permission bitmap in 
the TSS before allowing access to the I/O port. (See Volume 2 for details on the TSS I/O permission 
bitmap.) 


Mnemonic        Opcode    Description
IN AL, imm8     E4 ib     Input a byte from the port at the address specified by
                          imm8 and put it into the AL register.
IN AX, imm8     E5 ib     Input a word from the port at the address specified by
                          imm8 and put it into the AX register.
IN EAX, imm8    E5 ib     Input a doubleword from the port at the address
                          specified by imm8 and put it into the EAX register.
IN AL, DX       EC        Input a byte from the port at the address specified by the
                          DX register and put it into the AL register.
IN AX, DX       ED        Input a word from the port at the address specified by
                          the DX register and put it into the AX register.
IN EAX, DX      ED        Input a doubleword from the port at the address
                          specified by the DX register and put it into the EAX
                          register.


Related Instructions 

INSx, OUT, OUTSx 

rFLAGS Affected 

None 


24594 — Rev. 3.28—September 2019 


AMD64 Technology 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP        X                        One or more I/O permission bits were set in the TSS for the
                                                        accessed port.
                                             X          The CPL was greater than the IOPL and one or more I/O
                                                        permission bits were set in the TSS for the accessed port.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.

INC Increment by 1 

Adds 1 to the specified register or memory location. The CF flag is not affected, even if the operand is 
incremented to 0000. 

The one-byte forms of this instruction (opcodes 40 through 47) are used as REX prefixes in 64-bit 
mode. See “REX Prefix” on page 14. 
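
The dual use of opcodes 40h through 4Fh can be illustrated with a small Python decoder. This is a sketch, not a full instruction decoder: the function name `decode_4x` is hypothetical, register names assume a 32-bit operand size, and the REX bit layout (0100WRXB) is taken from the general REX prefix definition rather than from this page.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_4x(byte: int, mode64: bool) -> dict:
    """Classify a byte in the range 40h-4Fh by operating mode."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("byte is outside 40h-4Fh")
    if mode64:
        # In 64-bit mode the byte is a REX prefix: 0100 W R X B.
        return {
            "kind": "REX",
            "W": (byte >> 3) & 1,
            "R": (byte >> 2) & 1,
            "X": (byte >> 1) & 1,
            "B": byte & 1,
        }
    # Outside 64-bit mode: 40h+r is INC reg, 48h+r is DEC reg.
    kind = "INC" if byte < 0x48 else "DEC"
    return {"kind": kind, "reg": REG32[byte & 7]}


print(decode_4x(0x41, mode64=False))
print(decode_4x(0x41, mode64=True))
```

In 64-bit mode, code that needs to increment a register therefore uses the two-byte FF /0 form (or FE /0 for byte operands) instead.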

The forms of the INC instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 

To perform an increment operation that updates the CF flag, use an ADD instruction with an 
immediate operand of 1. 


Mnemonic         Opcode    Description
INC reg/mem8     FE /0     Increment the contents of an 8-bit register or memory
                           location by 1.
INC reg/mem16    FF /0     Increment the contents of a 16-bit register or memory
                           location by 1.
INC reg/mem32    FF /0     Increment the contents of a 32-bit register or memory
                           location by 1.
INC reg/mem64    FF /0     Increment the contents of a 64-bit register or memory
                           location by 1.
INC reg16        40 +rw    Increment the contents of a 16-bit register by 1.
                           (These opcodes are used as REX prefixes in 64-bit
                           mode. See “REX Prefix” on page 14.)
INC reg32        40 +rd    Increment the contents of a 32-bit register by 1.
                           (These opcodes are used as REX prefixes in 64-bit
                           mode. See “REX Prefix” on page 14.)


Related Instructions 

ADD, DEC 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was
                                                        non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

INS Input String 

INSB 

INSW 

INSD 

Transfers data from the I/O port specified in the DX register to an input buffer specified in the rDI 
register and increments or decrements the rDI register according to the setting of the DF flag in the 
rFLAGS register. 

If the DF flag is 0, the instruction increments rDI by 1, 2, or 4, depending on the number of bytes read. 
If the DF flag is 1, it decrements the pointer by 1, 2, or 4. 

In 16-bit and 32-bit mode, the INS instruction always uses ES as the data segment. The ES segment 
cannot be overridden with a segment override prefix. In 64-bit mode, INS always uses the 
unsegmented memory space. 

The INS instructions use the explicit memory operand (first operand) to determine the size of the I/O 
port, but always use ES:[rDI] for the location of the input buffer. The explicit register operand (second 
operand) specifies the I/O port address and must always be DX. 

The INSB, INSW, and INSD instructions copy byte, word, and doubleword data, respectively, from 
the I/O port (0000h to FFFFh) specified in the DX register to the input buffer specified in the ES:rDI 
registers. 

If the operand size is 64 bits, the instruction behaves as if the operand size were 32 bits. 

If the CPL is higher than the IOPL or the mode is virtual mode, INSx checks the I/O permission bitmap 
in the TSS before allowing access to the I/O port. (See Volume 2 for details on the TSS I/O permission 
bitmap.) 

The INSx instructions support the REP prefix for block input of rCX bytes, words, or doublewords. 
For details about the REP prefix, see “Repeat Prefixes” on page 12. 
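
The pointer update and the effect of the REP prefix can be sketched in Python. The model below is an assumption-laden illustration: memory is a `bytearray`, the port is a hypothetical callable `read_port(size)` that returns the next value, segmentation and permission checks are ignored, and the function name `rep_ins` is hypothetical.

```python
def rep_ins(read_port, memory: bytearray, rdi: int, rcx: int,
            size: int, df: int):
    """Model REP INSB/INSW/INSD (size = 1, 2, or 4 bytes).

    Returns the final (rDI, rCX).
    """
    step = -size if df else size
    while rcx:
        value = read_port(size)
        # Store the item at [rDI], least-significant byte first.
        memory[rdi:rdi + size] = value.to_bytes(size, "little")
        rdi += step   # DF = 0 increments, DF = 1 decrements
        rcx -= 1
    return rdi, rcx


values = iter([0x1111, 0x2222])
buffer = bytearray(8)
print(rep_ins(lambda size: next(values), buffer, 0, 2, 2, df=0))
print(buffer.hex())
```

With DF = 1 the same call fills the buffer from higher addresses toward lower ones, so the starting rDI would point at the last item slot rather than the first.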


Mnemonic          Opcode    Description
INS mem8, DX      6C        Input a byte from the port specified by DX, put it into the
                            memory location specified in ES:rDI, and then
                            increment or decrement rDI.
INS mem16, DX     6D        Input a word from the port specified by DX register, put it
                            into the memory location specified in ES:rDI, and then
                            increment or decrement rDI.
INS mem32, DX     6D        Input a doubleword from the port specified by DX, put it
                            into the memory location specified in ES:rDI, and then
                            increment or decrement rDI.
INSB              6C        Input a byte from the port specified by DX, put it into the
                            memory location specified in ES:rDI, and then
                            increment or decrement rDI.
INSW              6D        Input a word from the port specified by DX, put it into the
                            memory location specified in ES:rDI, and then
                            increment or decrement rDI.
INSD              6D        Input a doubleword from the port specified by DX, put it
                            into the memory location specified in ES:rDI, and then
                            increment or decrement rDI.


Related Instructions 

IN, OUT, OUTSx 

rFLAGS Affected 

None 

Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was
                                                        non-canonical.
                               X                        One or more I/O permission bits were set in the TSS for the
                                                        accessed port.
                                             X          The CPL was greater than the IOPL and one or more I/O
                                                        permission bits were set in the TSS for the accessed port.
                                             X          A null data segment was used to reference memory.
                                             X          The destination operand was in a non-writable segment.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while
                                                        alignment checking was enabled.

INT Interrupt to Vector 

Transfers execution to the interrupt handler specified by an 8-bit unsigned immediate value. This value 
is an interrupt vector number (00h to FFh), which the processor uses as an index into the 
interrupt-descriptor table (IDT). 

For detailed descriptions of the steps performed by INTn instructions, see the following: 

• Legacy-Mode Interrupts: “Virtual-8086 Mode Interrupt Control Transfers” in Volume 2. 

• Long-Mode Interrupts: “Long-Mode Interrupt Control Transfers” in Volume 2. 

See also the descriptions of the INT3 instruction on page 367 and the INTO instruction on page 189. 
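
For the real-mode case, the Action pseudocode for this instruction reads a 16-bit offset at vector*4 and a 16-bit segment at vector*4+2 from the IDT, then forms the code-segment base by shifting the segment left by 4. A minimal Python sketch of that lookup follows; the table is a byte buffer, and the names `real_mode_target` and `idt_base` are hypothetical. Stack pushes, the limit check, and flag updates are left out.

```python
import struct


def real_mode_target(idt: bytes, vector: int, idt_base: int = 0):
    """Return (CS selector, RIP, CS base) for a real-mode INT vector."""
    if not 0x00 <= vector <= 0xFF:
        raise ValueError("vector must be in the range 00h to FFh")
    # Each real-mode entry is 4 bytes: offset word, then segment word.
    offset, segment = struct.unpack_from("<HH", idt, idt_base + vector * 4)
    return segment, offset, segment << 4


# Hypothetical table with one entry: vector 21h -> F000h:1234h.
table = bytearray(256 * 4)
struct.pack_into("<HH", table, 0x21 * 4, 0x1234, 0xF000)
selector, rip, base = real_mode_target(table, 0x21)
print(hex(selector), hex(rip), hex(base + rip))
```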


Mnemonic    Opcode    Description
INT imm8    CD ib     Call interrupt service routine specified by interrupt
                      vector imm8.

Action 



// See "Pseudocode Definition" on page 57.

INT_N_START:

    IF (REAL_MODE)
        INT_N_REAL
    ELSIF (PROTECTED_MODE)
        INT_N_PROTECTED
    ELSE // (VIRTUAL_MODE)
        INT_N_VIRTUAL

INT_N_REAL:

    temp_int_n_vector = byte-sized interrupt vector specified in the instruction,
                        zero-extended to 64 bits

    temp_RIP = READ_MEM.w [idt:temp_int_n_vector*4]
                          // read target CS:RIP from the real-mode idt
    temp_CS = READ_MEM.w [idt:temp_int_n_vector*4+2]

    PUSH.w old_RFLAGS
    PUSH.w old_CS
    PUSH.w next_RIP

    IF (temp_RIP>CS.limit)
        EXCEPTION [#GP]

    CS.sel = temp_CS
    CS.base = temp_CS SHL 4

    RFLAGS.AC,TF,IF,RF cleared
    RIP = temp_RIP
    EXIT
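
The real-mode lookup in INT_N_REAL can be modeled in a few lines. The following Python sketch is illustrative only: the function name real_mode_int_target, the flat bytes object standing in for memory, and the idt_base parameter are assumptions made for the example, not part of the architecture definition. It reads the offset word at vector*4 and the selector word at vector*4+2, then forms the linear address from CS.base = selector SHL 4.

```python
import struct


def real_mode_int_target(memory: bytes, vector: int, idt_base: int = 0):
    """Return (cs_selector, ip, linear_address) for a real-mode INT n.

    Hypothetical helper: 'memory' is a flat byte image and 'idt_base' is the
    base of the real-mode interrupt table (assumed to be 0 by default).
    """
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt vector must fit in 8 bits")
    entry = idt_base + vector * 4
    # Offset word first, selector word second, both little-endian.
    ip, cs = struct.unpack_from("<HH", memory, entry)
    cs_base = cs << 4  # CS.base = temp_CS SHL 4
    return cs, ip, cs_base + ip


# Usage: install a handler at 1234h:5678h for vector 21h and look it up.
table = bytearray(1024)
struct.pack_into("<HH", table, 0x21 * 4, 0x5678, 0x1234)
selector, offset, linear = real_mode_int_target(bytes(table), 0x21)
```

The sketch omits the stack pushes, the limit check, and the flag updates that the pseudocode performs around the table read.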


INT_N_PROTECTED:

    temp_int_n_vector = byte-sized interrupt vector specified in the instruction,
                        zero-extended to 64 bits
    temp_idt_desc = READ_IDT (temp_int_n_vector)

    IF (temp_idt_desc.attr.type == 'taskgate')
        TASK_SWITCH // using tss selector in the task gate as the target tss

    IF (LONG_MODE)  // The size of the gate controls the size of the
                    // stack pushes.
        V=8-byte    // Long mode only uses 64-bit gates.
    ELSIF ((temp_idt_desc.attr.type == 'intgate32')
           || (temp_idt_desc.attr.type == 'trapgate32'))
        V=4-byte    // Legacy mode, using a 32-bit gate
    ELSE            // gate is intgate16 or trapgate16
        V=2-byte    // Legacy mode, using a 16-bit gate

    temp_RIP = temp_idt_desc.offset

    IF (LONG_MODE)
                    // In long mode, we need to read the 2nd half of a
                    // 16-byte interrupt-gate from the IDT, to get the
                    // upper 32 bits of the target RIP
    {
        temp_upper = READ_MEM.q [idt:temp_int_n_vector*16+8]
        temp_RIP = temp_RIP + (temp_upper SHL 32) // concatenate both halves of RIP
    }

    CS = READ_DESCRIPTOR (temp_idt_desc.segment, intcs_chk)

    IF (CS.attr.conforming==1)
        temp_CPL = CPL
    ELSE
        temp_CPL = CS.attr.dpl

    IF (CPL==temp_CPL) // no privilege-level change
    {
        IF (LONG_MODE)
        {
            IF (temp_idt_desc.ist!=0)
                    // In long mode, if the IDT gate specifies an IST pointer,
                    // a stack-switch is always done
                RSP = READ_MEM.q [tss:ist_index*8+28]

            RSP = RSP AND 0xFFFFFFFFFFFFFFF0
                    // In long mode, interrupts/exceptions align RSP to a
                    // 16-byte boundary

            PUSH.q old_SS  // In long mode, SS:RSP is always pushed to the stack
            PUSH.q old_RSP
        }

        PUSH.v old_RFLAGS
        PUSH.v old_CS
        PUSH.v next_RIP

        IF ((64BIT_MODE) && (temp_RIP is non-canonical)
            || (!64BIT_MODE) && (temp_RIP > CS.limit))
            EXCEPTION [#GP(0)]

        RFLAGS.VM,NT,TF,RF cleared
        RFLAGS.IF cleared if interrupt gate

        RIP = temp_RIP
        EXIT
    }
    ELSE // (CPL > temp_CPL), changing privilege level
    {
        CPL = temp_CPL
        temp_SS_desc:temp_RSP = READ_INNER_LEVEL_STACK_POINTER
                                    (CPL, temp_idt_desc.ist)

        IF (LONG_MODE)
            temp_RSP = temp_RSP AND 0xFFFFFFFFFFFFFFF0
                    // in long mode, interrupts/exceptions align rsp
                    // to a 16-byte boundary

        RSP.q = temp_RSP
        SS = temp_SS_desc

        PUSH.v old_SS // #SS on the following pushes uses SS.sel as error code
        PUSH.v old_RSP
        PUSH.v old_RFLAGS
        PUSH.v old_CS
        PUSH.v next_RIP

        IF ((64BIT_MODE) && (temp_RIP is non-canonical)
            || (!64BIT_MODE) && (temp_RIP > CS.limit))
            EXCEPTION [#GP(0)]

        RFLAGS.VM,NT,TF,RF cleared
        RFLAGS.IF cleared if interrupt gate

        RIP = temp_RIP
        EXIT
    }
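
Three small computations in INT_N_PROTECTED are easy to get wrong by hand: the push size V, the concatenation of the two halves of the long-mode target RIP, and the 16-byte alignment of RSP. The Python sketch below models only those three steps. The function names are hypothetical, and the 64-bit wraparound applied to the shift is an assumption about how the pseudocode's SHL behaves on a quadword.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def push_size_bytes(long_mode: bool, gate_type: str) -> int:
    """Size V of each stack push, chosen by the gate size."""
    if long_mode:
        return 8  # long mode only uses 64-bit gates
    if gate_type in ("intgate32", "trapgate32"):
        return 4  # legacy mode, 32-bit gate
    return 2      # legacy mode, 16-bit gate


def long_mode_target_rip(gate_offset: int, temp_upper: int) -> int:
    """temp_RIP + (temp_upper SHL 32), truncated to 64 bits (assumed)."""
    return (gate_offset + (temp_upper << 32)) & MASK64


def align_interrupt_rsp(rsp: int) -> int:
    """Long-mode interrupts align RSP down to a 16-byte boundary."""
    return rsp & 0xFFFFFFFFFFFFFFF0


# Usage: a gate whose low half holds 00401000h and whose upper quadword
# holds FFFFFFFFh in its low 32 bits.
target = long_mode_target_rip(0x00401000, 0xFFFFFFFF)
aligned = align_interrupt_rsp(0x00007FFFFFFFE01C)
```

Descriptor reads, privilege checks, and the canonical-address test are left out; they depend on state that the sketch does not model.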

INT_N_VIRTUAL:

    temp_int_n_vector = byte-sized interrupt vector specified in the instruction,
                        zero-extended to 64 bits

    IF (CR4.VME==0) // vme isn't enabled
    {
        IF (RFLAGS.IOPL==3)
            INT_N_VIRTUAL_TO_PROTECTED
        ELSE
            EXCEPTION [#GP(0)]
    }

    temp_IRB_BASE = READ_MEM.w [tss:102] - 32
                    // check the vme Int-n Redirection Bitmap (IRB), to see
                    // if we should redirect this interrupt to a virtual-mode
                    // handler
    temp_VME_REDIRECTION_BIT = READ_BIT_ARRAY ([tss:temp_IRB_BASE],
                                               temp_int_n_vector)

    IF (temp_VME_REDIRECTION_BIT==1)
    {               // the virtual-mode int-n bitmap bit is set, so don't
                    // redirect this interrupt
        IF (RFLAGS.IOPL==3)
            INT_N_VIRTUAL_TO_PROTECTED
        ELSE
            EXCEPTION [#GP(0)]
    }
    ELSE            // redirect interrupt through virtual-mode idt
    {
        temp_RIP = READ_MEM.w [0:temp_int_n_vector*4]
                    // read target CS:RIP from the virtual-mode idt at
                    // linear address 0
        temp_CS = READ_MEM.w [0:temp_int_n_vector*4+2]

        IF (RFLAGS.IOPL < 3)
            old_RFLAGS = old_RFLAGS with VIF bit shifted into IF bit, and IOPL = 3

        PUSH.w old_RFLAGS
        PUSH.w old_CS
        PUSH.w next_RIP

        CS.sel = temp_CS
        CS.base = temp_CS SHL 4

        RFLAGS.TF,RF cleared
        RIP = temp_RIP // RFLAGS.IF cleared if IOPL == 3
                       // RFLAGS.VIF cleared if IOPL < 3
        EXIT
    }
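
The decision made by INT_N_VIRTUAL can be summarized as a three-way outcome: use the protected-mode handler, raise #GP(0), or redirect through the virtual-mode table. The Python sketch below follows the pseudocode above. The function name, the string return values, and the bit ordering inside READ_BIT_ARRAY (bit n is bit n mod 8 of byte n div 8) are assumptions made for illustration.

```python
def int_n_virtual_path(cr4_vme: bool, iopl: int, tss: bytes, vector: int) -> str:
    """Return 'protected', '#GP(0)', or 'virtual' for INT n in virtual-8086 mode."""
    if not cr4_vme:  # VME is not enabled
        return "protected" if iopl == 3 else "#GP(0)"

    # temp_IRB_BASE = READ_MEM.w [tss:102] - 32
    irb_base = int.from_bytes(tss[102:104], "little") - 32
    if irb_base < 0:
        raise ValueError("redirection bitmap would start before the TSS")

    # Assumed bit-array layout: least-significant bit of each byte first.
    redirection_bit = (tss[irb_base + vector // 8] >> (vector % 8)) & 1

    if redirection_bit == 1:  # bit set: do not redirect this interrupt
        return "protected" if iopl == 3 else "#GP(0)"
    return "virtual"          # redirect through the virtual-mode idt


# Usage: a 256-byte TSS image whose word at offset 102 is 136, so the
# redirection bitmap starts at offset 104. Only vector 21h has its bit set.
tss_image = bytearray(256)
tss_image[102:104] = (136).to_bytes(2, "little")
tss_image[104 + 0x21 // 8] |= 1 << (0x21 % 8)
path = int_n_virtual_path(True, 0, bytes(tss_image), 0x20)
```

The two #GP(0) outcomes correspond to the two virtual-8086-only general-protection rows in the exception table for this instruction.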


INT_N_VIRTUAL_TO_PROTECTED:

    temp_idt_desc = READ_IDT (temp_int_n_vector)

    IF (temp_idt_desc.attr.type == 'taskgate')
        TASK_SWITCH // using tss selector in the task gate as the target tss

    IF ((temp_idt_desc.attr.type == 'intgate32')
        || (temp_idt_desc.attr.type == 'trapgate32'))
                    // the size of the gate controls the size of the stack pushes
        V=4-byte    // legacy mode, using a 32-bit gate
    ELSE            // gate is intgate16 or trapgate16
        V=2-byte    // legacy mode, using a 16-bit gate

    temp_RIP = temp_idt_desc.offset

    CS = READ_DESCRIPTOR (temp_idt_desc.segment, intcs_chk)

    IF (CS.attr.dpl!=0) // Handler must run at CPL 0.
        EXCEPTION [#GP(CS.sel)]

    CPL = 0
    temp_ist = 0 // Legacy mode doesn't use ist pointers
    temp_SS_desc:temp_RSP = READ_INNER_LEVEL_STACK_POINTER (CPL, temp_ist)

    RSP.q = temp_RSP
    SS = temp_SS_desc

    PUSH.v old_GS // #SS on the following pushes use SS.sel as error code.
    PUSH.v old_FS
    PUSH.v old_DS
    PUSH.v old_ES
    PUSH.v old_SS
    PUSH.v old_RSP
    PUSH.v old_RFLAGS // Pushed with RF clear.
    PUSH.v old_CS
    PUSH.v next_RIP

    IF (temp_RIP > CS.limit)
        EXCEPTION [#GP(0)]

    DS = NULL // can't use virtual-mode selectors in protected mode
    ES = NULL // can't use virtual-mode selectors in protected mode
    FS = NULL // can't use virtual-mode selectors in protected mode
    GS = NULL // can't use virtual-mode selectors in protected mode

    RFLAGS.VM,NT,TF,RF cleared
    RFLAGS.IF cleared if interrupt gate

    RIP = temp_RIP
    EXIT

Related Instructions 

INT 3, INTO, BOUND 

rFLAGS Affected 


If a task switch occurs, all flags are modified. Otherwise settings are as follows: 


Flag    ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
Value             M    M    M    0    M                     M    0
Bit     21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid TSS, #TS (selector)                X             X          As part of a stack switch, the target stack segment selector or rSP in the TSS was beyond the TSS limit.
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS was a null selector.
                                           X             X          As part of a stack switch, the target stack segment selector’s TI bit was set, but the LDT selector was a null selector.
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS was beyond the limit of the GDT or LDT descriptor table.
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS contained a RPL that was not equal to its DPL.
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS contained a DPL that was not equal to the CPL of the code segment selector.
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS was not a writable segment.
Segment not present, #NP (selector)        X             X          The accessed code segment, interrupt gate, trap gate, task gate, or TSS was not present.
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical, and no stack switch occurred.
Stack, #SS (selector)                      X             X          After a stack switch, a memory address exceeded the stack segment limit or was non-canonical.
                                           X             X          As part of a stack switch, the SS register was loaded with a non-null segment selector and the segment was marked not present.
General protection, #GP              X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                     X     X             X          The target offset exceeded the code segment limit or was non-canonical.
                                           X                        The IOPL was less than 3 and CR4.VME was 0.
                                           X                        IOPL was less than 3, CR4.VME was 1, and the corresponding bit in the VME interrupt redirection bitmap was 1.
General protection, #GP (selector)   X     X             X          The interrupt vector was beyond the limit of IDT.
                                           X             X          The descriptor in the IDT was not an interrupt, trap, or task gate in legacy mode or not a 64-bit interrupt or trap gate in long mode.
                                           X             X          The DPL of the interrupt, trap, or task gate descriptor was less than the CPL.
                                           X             X          The segment selector specified by the interrupt or trap gate had its TI bit set, but the LDT selector was a null selector.
                                           X             X          The segment descriptor specified by the interrupt or trap gate exceeded the descriptor table limit or was a null selector.
                                           X             X          The segment descriptor specified by the interrupt or trap gate was not a code segment in legacy mode, or not a 64-bit code segment in long mode.
                                                         X          The DPL of the segment specified by the interrupt or trap gate was greater than the CPL.
                                           X                        The DPL of the segment specified by the interrupt or trap gate was not 0 or it was a conforming segment.
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                       X             X          An unaligned memory reference was performed while alignment checking was enabled.




INTO Interrupt to Overflow Vector 

Checks the overflow flag (OF) in the rFLAGS register and calls the overflow exception (#OF) handler 
if the OF flag is set to 1. This instruction has no effect if the OF flag is cleared to 0. The INTO 
instruction detects overflow in signed number addition. See AMD64 Architecture Programmer’s 
Manual Volume 1: Application Programming for more information on the OF flag. 

Using this instruction in 64-bit mode generates an invalid-opcode exception. 
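
As an illustration of the condition that INTO tests, the following Python sketch computes the overflow flag that a signed addition would produce and then applies the INTO action. The function names and the string results are hypothetical; the overflow rule used (both operands have the same sign and the result has the opposite sign) is the standard two's-complement definition.

```python
def add_sets_of(a: int, b: int, bits: int = 32) -> bool:
    """Return True if a + b overflows as a signed 'bits'-wide addition."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a + b) & mask
    # Overflow: the result's sign differs from the sign of both operands.
    return ((a ^ result) & (b ^ result) & sign) != 0


def into_action(overflow_flag: bool, in_64bit_mode: bool) -> str:
    """Outcome of executing INTO: '#UD', '#OF', or 'continue'."""
    if in_64bit_mode:
        return "#UD"  # INTO is invalid in 64-bit mode
    return "#OF" if overflow_flag else "continue"


# Usage: 7FFFFFFFh + 1 overflows as a signed 32-bit addition.
outcome = into_action(add_sets_of(0x7FFFFFFF, 1), in_64bit_mode=False)
```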

For detailed descriptions of the steps performed by INT instructions, see the following:

• Legacy-Mode Interrupts: “Legacy Protected-Mode Interrupt Control Transfers” in Volume 2. 

• Long-Mode Interrupts: “Long-Mode Interrupt Control Transfers” in Volume 2. 

Mnemonic    Opcode   Description
INTO        CE       Call overflow exception if the overflow flag is set. (Invalid in 64-bit mode.)

Action 

IF (64BIT_MODE)
    EXCEPTION [#UD]

IF (RFLAGS.OF == 1) // #OF is a trap, and pushes the rIP of the instruction
    EXCEPTION [#OF] // following INTO.

EXIT

Related Instructions 

INT, INT 3, BOUND 

rFLAGS Affected 

None. 


Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Overflow, #OF                        X     X             X          The INTO instruction was executed with OF set to 1.
Invalid opcode, #UD                                      X          Instruction was executed in 64-bit mode.
Jcc Jump on Condition

Checks the status flags in the rFLAGS register and, if the flags meet the condition specified by the
condition code in the mnemonic (cc), jumps to the target instruction located at the specified relative
offset. Otherwise, execution continues with the instruction following the Jcc instruction.

Unlike the unconditional jump (JMP), conditional jump instructions have only two forms: short and
near conditional jumps. Different opcodes correspond to different forms of one instruction. For
example, the JO instruction (jump if overflow) has opcode 0Fh 80h for its near form and 70h for its
short form, but the mnemonic is the same for both forms. The only difference is that the near form has
a 16- or 32-bit relative displacement, while the short form always has an 8-bit relative displacement.

Mnemonics are provided to deal with the programming semantics of both signed and unsigned 
numbers. Instructions tagged A (above) and B (below) are intended for use in unsigned integer code; 
those tagged G (greater) and L (less) are intended for use in signed integer code. 

If the jump is taken, the signed displacement is added to the rIP (of the following instruction) and the 
result is truncated to 16, 32, or 64 bits, depending on operand size. 

In 64-bit mode, the operand size defaults to 64 bits. The processor sign-extends the 8-bit or 32-bit 
displacement value to 64 bits before adding it to the RIP. 
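
The target computation described above (sign-extend the displacement, add it to the rIP of the following instruction, truncate to the operand size) can be sketched in Python as follows. The helper names sign_extend and relative_target are hypothetical, and the sketch assumes the caller already knows the address of the following instruction.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low 'bits' bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def relative_target(next_rip: int, disp: int, disp_bits: int, operand_size: int) -> int:
    """Target of a taken relative jump.

    next_rip     -- rIP of the instruction following the jump
    disp         -- raw displacement field (8, 16, or 32 bits wide)
    operand_size -- 16, 32, or 64; the sum is truncated to this width
    """
    return (next_rip + sign_extend(disp, disp_bits)) & ((1 << operand_size) - 1)


# Usage: a 2-byte short jump at 1000h with displacement FEh (-2) targets itself.
target = relative_target(0x1002, 0xFE, 8, 64)
```

With a 16-bit operand size the truncation wraps the result within the low 16 bits, which is why a backward jump from a small offset lands near FFFFh rather than producing a negative address.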

These instructions cannot perform far jumps (to other code segments). To create a far-conditional- 
jump code sequence corresponding to a high-level language statement like: 

IF A == B THEN GOTO FarLabel 

where FarLabel is located in another code segment, use the opposite condition in a conditional short 
jump before an unconditional far jump. Such a code sequence might look like: 

    cmp A,B            ; compare operands
    jne NextInstr      ; continue program if not equal
    jmp far FarLabel   ; far jump if operands are equal

NextInstr:             ; continue program

For details about control-flow instructions, see “Control Transfers” in Volume 1, and “Control-
Transfer Privilege Checks” in Volume 2.


Mnemonic        Opcode     Description
JO rel8off      70 cb      Jump if overflow (OF = 1).
JO rel16off     0F 80 cw
JO rel32off     0F 80 cd
JNO rel8off     71 cb      Jump if not overflow (OF = 0).
JNO rel16off    0F 81 cw
JNO rel32off    0F 81 cd
JB rel8off      72 cb      Jump if below (CF = 1).
JB rel16off     0F 82 cw
JB rel32off     0F 82 cd
JC rel8off      72 cb      Jump if carry (CF = 1).
JC rel16off     0F 82 cw
JC rel32off     0F 82 cd
JNAE rel8off    72 cb      Jump if not above or equal (CF = 1).
JNAE rel16off   0F 82 cw
JNAE rel32off   0F 82 cd
JNB rel8off     73 cb      Jump if not below (CF = 0).
JNB rel16off    0F 83 cw
JNB rel32off    0F 83 cd
JNC rel8off     73 cb      Jump if not carry (CF = 0).
JNC rel16off    0F 83 cw
JNC rel32off    0F 83 cd
JAE rel8off     73 cb      Jump if above or equal (CF = 0).
JAE rel16off    0F 83 cw
JAE rel32off    0F 83 cd
JZ rel8off      74 cb      Jump if zero (ZF = 1).
JZ rel16off     0F 84 cw
JZ rel32off     0F 84 cd
JE rel8off      74 cb      Jump if equal (ZF = 1).
JE rel16off     0F 84 cw
JE rel32off     0F 84 cd
JNZ rel8off     75 cb      Jump if not zero (ZF = 0).
JNZ rel16off    0F 85 cw
JNZ rel32off    0F 85 cd
JNE rel8off     75 cb      Jump if not equal (ZF = 0).
JNE rel16off    0F 85 cw
JNE rel32off    0F 85 cd
JBE rel8off     76 cb      Jump if below or equal (CF = 1 or ZF = 1).
JBE rel16off    0F 86 cw
JBE rel32off    0F 86 cd
JNA rel8off     76 cb      Jump if not above (CF = 1 or ZF = 1).
JNA rel16off    0F 86 cw
JNA rel32off    0F 86 cd
JNBE rel8off    77 cb      Jump if not below or equal (CF = 0 and ZF = 0).
JNBE rel16off   0F 87 cw
JNBE rel32off   0F 87 cd
JA rel8off      77 cb      Jump if above (CF = 0 and ZF = 0).
JA rel16off     0F 87 cw
JA rel32off     0F 87 cd
JS rel8off      78 cb      Jump if sign (SF = 1).
JS rel16off     0F 88 cw
JS rel32off     0F 88 cd
JNS rel8off     79 cb      Jump if not sign (SF = 0).
JNS rel16off    0F 89 cw
JNS rel32off    0F 89 cd
JP rel8off      7A cb      Jump if parity (PF = 1).
JP rel16off     0F 8A cw
JP rel32off     0F 8A cd
JPE rel8off     7A cb      Jump if parity even (PF = 1).
JPE rel16off    0F 8A cw
JPE rel32off    0F 8A cd
JNP rel8off     7B cb      Jump if not parity (PF = 0).
JNP rel16off    0F 8B cw
JNP rel32off    0F 8B cd
JPO rel8off     7B cb      Jump if parity odd (PF = 0).
JPO rel16off    0F 8B cw
JPO rel32off    0F 8B cd
JL rel8off      7C cb      Jump if less (SF <> OF).
JL rel16off     0F 8C cw
JL rel32off     0F 8C cd
JNGE rel8off    7C cb      Jump if not greater or equal (SF <> OF).
JNGE rel16off   0F 8C cw
JNGE rel32off   0F 8C cd
JNL rel8off     7D cb      Jump if not less (SF = OF).
JNL rel16off    0F 8D cw
JNL rel32off    0F 8D cd
JGE rel8off     7D cb      Jump if greater or equal (SF = OF).
JGE rel16off    0F 8D cw
JGE rel32off    0F 8D cd
JLE rel8off     7E cb      Jump if less or equal (ZF = 1 or SF <> OF).
JLE rel16off    0F 8E cw
JLE rel32off    0F 8E cd
JNG rel8off     7E cb      Jump if not greater (ZF = 1 or SF <> OF).
JNG rel16off    0F 8E cw
JNG rel32off    0F 8E cd
JNLE rel8off    7F cb      Jump if not less or equal (ZF = 0 and SF = OF).
JNLE rel16off   0F 8F cw
JNLE rel32off   0F 8F cd
JG rel8off      7F cb      Jump if greater (ZF = 0 and SF = OF).
JG rel16off     0F 8F cw
JG rel32off     0F 8F cd
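
The table above reduces to sixteen distinct conditions, each with one or more mnemonic aliases, where condition number i uses short opcode 70h+i and near opcode 0Fh 80h+i. The Python sketch below encodes that mapping. The names jcc_taken, short_opcode, and near_opcode, and the dictionary representation of the flags, are assumptions made for the example.

```python
# Index i corresponds to short opcode 70h+i and near opcode 0Fh (80h+i).
_CONDITIONS = [
    (("O",),             lambda f: f["OF"] == 1),
    (("NO",),            lambda f: f["OF"] == 0),
    (("B", "C", "NAE"),  lambda f: f["CF"] == 1),
    (("NB", "NC", "AE"), lambda f: f["CF"] == 0),
    (("Z", "E"),         lambda f: f["ZF"] == 1),
    (("NZ", "NE"),       lambda f: f["ZF"] == 0),
    (("BE", "NA"),       lambda f: f["CF"] == 1 or f["ZF"] == 1),
    (("NBE", "A"),       lambda f: f["CF"] == 0 and f["ZF"] == 0),
    (("S",),             lambda f: f["SF"] == 1),
    (("NS",),            lambda f: f["SF"] == 0),
    (("P", "PE"),        lambda f: f["PF"] == 1),
    (("NP", "PO"),       lambda f: f["PF"] == 0),
    (("L", "NGE"),       lambda f: f["SF"] != f["OF"]),
    (("NL", "GE"),       lambda f: f["SF"] == f["OF"]),
    (("LE", "NG"),       lambda f: f["ZF"] == 1 or f["SF"] != f["OF"]),
    (("NLE", "G"),       lambda f: f["ZF"] == 0 and f["SF"] == f["OF"]),
]

_LOOKUP = {
    name: (index, predicate)
    for index, (names, predicate) in enumerate(_CONDITIONS)
    for name in names
}


def _entry(mnemonic: str):
    text = mnemonic.upper()
    if not text.startswith("J") or text[1:] not in _LOOKUP:
        raise ValueError(f"not a Jcc mnemonic: {mnemonic}")
    return _LOOKUP[text[1:]]


def jcc_taken(mnemonic: str, flags: dict) -> bool:
    """True if the conditional jump would be taken for the given flag values."""
    return _entry(mnemonic)[1](flags)


def short_opcode(mnemonic: str) -> int:
    """Opcode byte of the rel8off form."""
    return 0x70 + _entry(mnemonic)[0]


def near_opcode(mnemonic: str) -> tuple:
    """Opcode bytes of the rel16off/rel32off form."""
    return (0x0F, 0x80 + _entry(mnemonic)[0])


# Usage: after a signed compare that leaves SF=1 and OF=0, JL is taken.
flags = {"OF": 0, "CF": 0, "ZF": 0, "SF": 1, "PF": 0}
taken = jcc_taken("JL", flags)
```

Grouping aliases under a single index mirrors the table: JB, JC, and JNAE, for example, are three spellings of one encoding.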


Related Instructions 

JMP (Near), JMP (Far), JrCXZ 

rFLAGS Affected 

None 
Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP              X     X             X          The target offset exceeded the code segment limit or was non-canonical.
JCXZ Jump if rCX Zero

JECXZ 

JRCXZ 

Checks the contents of the count register (rCX) and, if 0, jumps to the target instruction located at the 
specified 8-bit relative offset. Otherwise, execution continues with the instruction following the 
JrCXZ instruction. 

The size of the count register (CX, ECX, or RCX) depends on the address-size attribute of the JrCXZ 
instruction. Therefore, JRCXZ can only be executed in 64-bit mode and JCXZ cannot be executed in 
64-bit mode. 
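
Because all three mnemonics share opcode E3, the address-size attribute alone selects how much of rCX is tested. The following Python sketch shows that selection; the function name jrcxz_taken and the integer address_size parameter are hypothetical.

```python
def jrcxz_taken(rcx: int, address_size: int) -> bool:
    """True if JCXZ/JECXZ/JRCXZ would jump.

    address_size selects the count register: 16 -> CX, 32 -> ECX, 64 -> RCX.
    """
    if address_size not in (16, 32, 64):
        raise ValueError("address size must be 16, 32, or 64")
    return (rcx & ((1 << address_size) - 1)) == 0


# Usage: RCX = 0001_0000h has CX = 0 but ECX != 0.
cx_zero = jrcxz_taken(0x0001_0000, 16)
ecx_zero = jrcxz_taken(0x0001_0000, 32)
```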

If the jump is taken, the signed displacement is added to the rIP (of the following instruction) and the 
result is truncated to 16, 32, or 64 bits, depending on operand size. 

In 64-bit mode, the operand size defaults to 64 bits. The processor sign-extends the 8-bit displacement 
value to 64 bits before adding it to the RIP. 

For details about control-flow instructions, see “Control Transfers” in Volume 1, and “Control- 
Transfer Privilege Checks” in Volume 2. 


Mnemonic         Opcode   Description
JCXZ rel8off     E3 cb    Jump short if the 16-bit count register (CX) is zero.
JECXZ rel8off    E3 cb    Jump short if the 32-bit count register (ECX) is zero.
JRCXZ rel8off    E3 cb    Jump short if the 64-bit count register (RCX) is zero.


Related Instructions 

Jcc, JMP (Near), JMP (Far) 

rFLAGS Affected 

None 


Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP              X     X             X          The target offset exceeded the code segment limit or was non-canonical.
JMP (Near) Near Jump

Unconditionally transfers control to a new address without saving the current rIP value. This form of 
the instruction jumps to an address in the current code segment and is called a near jump. The target 
operand can specify a register, a memory location, or a label. 

If the JMP target is specified in a register or memory location, then a 16-, 32-, or 64-bit rIP is read from 
the operand, depending on operand size. This rIP is zero-extended to 64 bits. 

If the JMP target is specified by a displacement in the instruction, the signed displacement is added to 
the rIP (of the following instruction), and the result is truncated to 16, 32, or 64 bits depending on 
operand size. The signed displacement can be 8 bits, 16 bits, or 32 bits, depending on the opcode and 
the operand size. 

For near jumps in 64-bit mode, the operand size defaults to 64 bits. The E9 opcode results in RIP = RIP 
+ 32-bit signed displacement, and the FF /4 opcode results in RIP = 64-bit offset from register or 
memory. No prefix is available to encode a 32-bit operand size in 64-bit mode. 
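
For the displacement forms, the jump target follows directly from the instruction bytes. The Python sketch below decodes the EB and E9 encodings only. It is a simplified illustration: the function name decode_relative_jmp is hypothetical, the byte string is assumed to start at the opcode with no prefixes, and the operand size is passed in by the caller rather than derived from the operating mode.

```python
def decode_relative_jmp(code: bytes, ip: int, operand_size: int) -> int:
    """Return the target of a JMP rel8off/rel16off/rel32off located at ip."""
    if not code:
        raise ValueError("empty instruction")
    opcode = code[0]
    if opcode == 0xEB:          # short jump, 8-bit displacement
        disp_len = 1
    elif opcode == 0xE9:        # near jump, 16- or 32-bit displacement
        disp_len = 2 if operand_size == 16 else 4
    else:
        raise ValueError("not a relative JMP opcode")
    if len(code) < 1 + disp_len:
        raise ValueError("truncated instruction")

    disp = int.from_bytes(code[1:1 + disp_len], "little", signed=True)
    next_ip = ip + 1 + disp_len  # rIP of the following instruction
    return (next_ip + disp) & ((1 << operand_size) - 1)


# Usage: E9 FB FF FF FF at 2000h is a 5-byte jump back to 2000h.
target = decode_relative_jmp(bytes([0xE9, 0xFB, 0xFF, 0xFF, 0xFF]), 0x2000, 64)
```

In 64-bit mode the E9 form always carries a 32-bit displacement, which the sketch reproduces by using four displacement bytes whenever the operand size is not 16.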

See JMP (Far) for information on far jumps—jumps to procedures located outside of the current code 
segment. For details about control-flow instructions, see “Control Transfers” in Volume 1, and 
“Control-Transfer Privilege Checks” in Volume 2. 


Mnemonic         Opcode   Description
JMP rel8off      EB cb    Short jump with the target specified by an 8-bit signed displacement.
JMP rel16off     E9 cw    Near jump with the target specified by a 16-bit signed displacement.
JMP rel32off     E9 cd    Near jump with the target specified by a 32-bit signed displacement.
JMP reg/mem16    FF /4    Near jump with the target specified by reg/mem16.
JMP reg/mem32    FF /4    Near jump with the target specified by reg/mem32. (No prefix for encoding in 64-bit mode.)
JMP reg/mem64    FF /4    Near jump with the target specified by reg/mem64.


Related Instructions 

JMP (Far), Jcc, JrCXZ

rFLAGS Affected 

None. 
Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP              X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                     X     X             X          The target offset exceeded the code segment limit or was non-canonical.
                                                         X          A null data segment was used to reference memory.
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                       X             X          An unaligned memory reference was performed while alignment checking was enabled.
JMP (Far) Far Jump

Unconditionally transfers control to a new address without saving the current CS:rIP values. This form 
of the instruction jumps to an address outside the current code segment and is called a far jump. The 
operand specifies a target selector and offset. 

The target operand can be specified by the instruction directly, by containing the far pointer in the JMP
far opcode itself, or indirectly, by referencing a far pointer in memory. In 64-bit mode, only indirect far
jumps are allowed; executing a direct far JMP (opcode EA) generates an invalid-opcode exception
(#UD). For both direct and indirect far jumps, if the JMP (Far) operand size is 16 bits, the
instruction's operand is a 16-bit selector followed by a 16-bit offset. If the operand size is 32 or 64 bits,
the operand is a 16-bit selector followed by a 32-bit offset.
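
As an illustration of the operand sizes just described, the Python sketch below reads an indirect far pointer from a byte image. The function name read_far_pointer and the flat bytes object are assumptions. The byte order used here follows the Action pseudocode for this instruction, which reads the offset at [mem] and the 16-bit selector immediately after it at [mem+Z].

```python
def read_far_pointer(memory: bytes, address: int, operand_size: int):
    """Return (selector, offset) for an indirect far jump operand.

    A 16-bit operand size uses a 16-bit offset; 32- and 64-bit operand
    sizes both use a 32-bit offset.
    """
    if operand_size not in (16, 32, 64):
        raise ValueError("operand size must be 16, 32, or 64")
    offset_len = 2 if operand_size == 16 else 4
    end = address + offset_len + 2
    if end > len(memory):
        raise ValueError("far pointer extends past the end of memory")
    offset = int.from_bytes(memory[address:address + offset_len], "little")
    selector = int.from_bytes(memory[address + offset_len:end], "little")
    return selector, offset


# Usage: six bytes holding offset 12345678h followed by selector 0008h.
pointer = bytes.fromhex("785634120800")
selector, offset = read_far_pointer(pointer, 0, 32)
```

Reading the same six bytes with a 16-bit operand size yields a different selector, because the selector word then starts two bytes earlier.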

In all modes, the target selector used by the instruction can be a code selector. Additionally, the target 
selector can also be a call gate in protected mode, or a task gate or TSS selector in legacy protected 
mode. 

• Target is a code segment: Control is transferred to the target CS:rIP. In this case, the target offset
can only be a 16- or 32-bit value, depending on operand size, and is zero-extended to 64 bits; 64-bit
offsets are only available via call gates. No CPL change is allowed.

• Target is a call gate: The call gate specifies the actual target code segment and offset, and control
is transferred to the target CS:rIP. When jumping through a call gate, the size of the target rIP is 16,
32, or 64 bits, depending on the size of the call gate. If the target rIP is less than 64 bits, it is zero-
extended to 64 bits. In long mode, only 64-bit call gates are allowed, and they must point to 64-bit
code segments. No CPL change is allowed.

• Target is a task gate or a TSS: If the mode is legacy protected mode, then a task switch occurs. See
“Hardware Task-Management in Legacy Mode” in Volume 2 for details about task switches.
Hardware task switches are not supported in long mode.

See JMP (Near) for information on near jumps, that is, jumps to procedures located inside the current code
segment. For details about control-flow instructions, see “Control Transfers” in Volume 1, and
“Control-Transfer Privilege Checks” in Volume 2. 


Mnemonic             Opcode   Description
JMP FAR pntr16:16    EA cd    Far jump direct, with the target specified by a far pointer contained in the instruction. (Invalid in 64-bit mode.)
JMP FAR pntr16:32    EA cp    Far jump direct, with the target specified by a far pointer contained in the instruction. (Invalid in 64-bit mode.)
JMP FAR mem16:16     FF /5    Far jump indirect, with the target specified by a far pointer in memory (16-bit operand size).
JMP FAR mem16:32     FF /5    Far jump indirect, with the target specified by a far pointer in memory (32- and 64-bit operand size).
Action

// Far jumps (JMPF)
// See "Pseudocode Definition" on page 57.

JMPF_START:

    IF (REAL_MODE)
        JMPF_REAL_OR_VIRTUAL
    ELSIF (PROTECTED_MODE)
        JMPF_PROTECTED
    ELSE // (VIRTUAL_MODE)
        JMPF_REAL_OR_VIRTUAL

JMPF_REAL_OR_VIRTUAL:

    IF (OPCODE == jmpf [mem]) // JMPF Indirect
    {
        temp_RIP = READ_MEM.z [mem]
        temp_CS = READ_MEM.w [mem+Z]
    }
    ELSE // (OPCODE == jmpf direct)
    {
        temp_RIP = z-sized offset specified in the instruction,
                   zero-extended to 64 bits
        temp_CS = selector specified in the instruction
    }

    IF (temp_RIP>CS.limit)
        EXCEPTION [#GP(0)]

    CS.sel = temp_CS
    CS.base = temp_CS SHL 4

    RIP = temp_RIP
    EXIT

JMPF_PROTECTED:

    IF (OPCODE == jmpf [mem]) // JMPF Indirect
    {
        temp_offset = READ_MEM.z [mem]
        temp_sel = READ_MEM.w [mem+Z]
    }
    ELSE // (OPCODE == jmpf direct)
    {
        IF (64BIT_MODE)
            EXCEPTION [#UD] // 'jmpf direct' is illegal in 64-bit mode

        temp_offset = z-sized offset specified in the instruction,
                      zero-extended to 64 bits
        temp_sel = selector specified in the instruction
    }

    temp_desc = READ_DESCRIPTOR (temp_sel, cs_chk)
                    // read descriptor, perform protection and type checks

    IF (temp_desc.attr.type == 'available_tss')
        TASK_SWITCH // using temp_sel as the target tss selector
    ELSIF (temp_desc.attr.type == 'taskgate')
        TASK_SWITCH // using the tss selector in the task gate as the
                    // target tss
    ELSIF (temp_desc.attr.type == 'code')
                    // if the selector refers to a code descriptor, then
                    // the offset we read is the target RIP
    {
        temp_RIP = temp_offset
        CS = temp_desc

        IF ((!64BIT_MODE) && (temp_RIP > CS.limit))
                    // temp_RIP can't be non-canonical because
                    // it's a 16- or 32-bit offset, zero-extended to 64 bits
        {
            EXCEPTION [#GP(0)]
        }

        RIP = temp_RIP
        EXIT
    }
    ELSE
    {
                    // (temp_desc.attr.type == 'callgate')
                    // if the selector refers to a call gate, then
                    // the target CS and RIP both come from the call gate
        temp_RIP = temp_desc.offset

        IF (LONG_MODE)
        {
                    // in long mode, we need to read the 2nd half of a 16-byte call-gate
                    // from the gdt/ldt to get the upper 32 bits of the target RIP
            temp_upper = READ_MEM.q [temp_sel+8]

            IF (temp_upper's extended attribute bits != 0)
                EXCEPTION [#GP(temp_sel)] // Make sure the extended
                                          // attribute bits are all zero.

            temp_RIP = temp_RIP + (temp_upper SHL 32)
                    // concatenate both halves of RIP
        }

        CS = READ_DESCRIPTOR (temp_desc.segment, clg_chk)
                    // set up new CS base, attr, limits

        IF ((64BIT_MODE) && (temp_RIP is non-canonical)
            || (!64BIT_MODE) && (temp_RIP > CS.limit))
            EXCEPTION [#GP(0)]

        RIP = temp_RIP
        EXIT
    }
Related Instructions

JMP (Near), Jcc, JrCXZ

rFLAGS Affected 

None, unless a task switch occurs, in which case all flags are modified. 

Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The far JUMP indirect opcode (FF /5) had a register operand.
                                                         X          The far JUMP direct opcode (EA) was executed in 64-bit mode.
Segment not present, #NP (selector)                      X          The accessed code segment, call gate, task gate, or TSS was not present.
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP              X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                     X     X             X          The target offset exceeded the code segment limit or was non-canonical.
                                                         X          A null data segment was used to reference memory.
General protection, #GP (selector)                       X          The target code segment selector was a null selector.
                                                         X          A code, call gate, task gate, or TSS descriptor exceeded the descriptor table limit.
                                                         X          A segment selector’s TI bit was set, but the LDT selector was a null selector.
                                                         X          The segment descriptor specified by the instruction was not a code segment, task gate, call gate or available TSS in legacy mode, or not a 64-bit code segment or a 64-bit call gate in long mode.
                                                         X          The RPL of the non-conforming code segment selector specified by the instruction was greater than the CPL, or its DPL was not equal to the CPL.
                                                         X          The DPL of the conforming code segment descriptor specified by the instruction was greater than the CPL.
                                                         X          The DPL of the callgate, taskgate, or TSS descriptor specified by the instruction was less than the CPL or less than its own RPL.
                                                         X          The segment selector specified by the call gate or task gate was a null selector.
                                                         X          The segment descriptor specified by the call gate was not a code segment in legacy mode or not a 64-bit code segment in long mode.
                                                         X          The DPL of the segment descriptor specified by the call gate was greater than the CPL and it is a conforming segment.
                                                         X          The DPL of the segment descriptor specified by the callgate was not equal to the CPL and it is a non-conforming segment.
                                                         X          The 64-bit call gate’s extended attribute bits were not zero.
                                                         X          The TSS descriptor was found in the LDT.
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                       X             X          An unaligned memory reference was performed while alignment checking was enabled.
LAHF Load Status Flags into AH Register

Loads the lower 8 bits of the rFLAGS register, including sign flag (SF), zero flag (ZF), auxiliary carry 
flag (AF), parity flag (PF), and carry flag (CF), into the AH register. 

The instruction sets the reserved bits 1, 3, and 5 of the rFLAGS register to 1, 0, and 0, respectively, in
the AH register.
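
The value that LAHF places in AH can be expressed as a mask over the low byte of rFLAGS. The Python sketch below is a minimal model of that result; the function name lahf_ah is hypothetical. The mask D5h keeps SF, ZF, AF, PF, and CF (bits 7, 6, 4, 2, and 0), bits 5 and 3 come out as 0, and bit 1 is forced to 1.

```python
def lahf_ah(rflags: int) -> int:
    """Value loaded into AH by LAHF for a given rFLAGS value."""
    status_bits = rflags & 0xD5  # SF, ZF, AF, PF, CF
    return status_bits | 0x02    # reserved bit 1 reads as 1


# Usage: with ZF and PF set (and bit 1 already 1) the result is 46h.
ah = lahf_ah(0x0246)
```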

The LAHF instruction is available in 64-bit mode if CPUID Fn8000_0001_ECX[LahfSahf] = 1. It is
always available in the other operating modes (including compatibility mode).

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic    Opcode   Description
LAHF        9F       Load the SF, ZF, AF, PF, and CF flags into the AH register.

Related Instructions

SAHF

rFLAGS Affected

None.


Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                                      X          The LAHF instruction is not supported in 64-bit mode, as indicated by CPUID Fn8000_0001_ECX[LahfSahf] = 0.
LDS Load Far Pointer

LES 

LFS 

LGS 

LSS 

Loads a far pointer from a memory location (second operand) into a segment register (mnemonic) and 
general-purpose register (first operand). The instruction stores the 16-bit segment selector of the 
pointer into the segment register and the 16-bit or 32-bit offset portion into the general-purpose 
register. The operand-size attribute determines whether the pointer loaded is 32 or 48 bits in length. A 
64-bit operand is not supported. 

These instructions load associated segment-descriptor information into the hidden portion of the 
specified segment register. 


Mnemonic               Opcode      Description
LDS reg16, mem16:16    C5 /r       Load DS:reg16 with a far pointer from memory. [Redefined as VEX (2-byte prefix) in 64-bit mode.]
LDS reg32, mem16:32    C5 /r       Load DS:reg32 with a far pointer from memory. [Redefined as VEX (2-byte prefix) in 64-bit mode.]
LES reg16, mem16:16    C4 /r       Load ES:reg16 with a far pointer from memory. [Redefined as VEX (3-byte prefix) in 64-bit mode.]
LES reg32, mem16:32    C4 /r       Load ES:reg32 with a far pointer from memory. [Redefined as VEX (3-byte prefix) in 64-bit mode.]
LFS reg16, mem16:16    0F B4 /r    Load FS:reg16 with a 32-bit far pointer from memory.
LFS reg32, mem16:32    0F B4 /r    Load FS:reg32 with a 48-bit far pointer from memory.
LGS reg16, mem16:16    0F B5 /r    Load GS:reg16 with a 32-bit far pointer from memory.
LGS reg32, mem16:32    0F B5 /r    Load GS:reg32 with a 48-bit far pointer from memory.
LSS reg16, mem16:16    0F B2 /r    Load SS:reg16 with a 32-bit far pointer from memory.
LSS reg32, mem16:32    0F B2 /r    Load SS:reg32 with a 48-bit far pointer from memory.

Related Instructions

None

rFLAGS Affected

None
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Exceptions 


Exception 

Real 

Virtual 

8086 

Protecte 

d 

Cause of Exception 

Invalid opcode, 

#UD 

X 

X 

X 

The source operand was a register. 



X 

LDS or LES was executed in 64-bit mode and not subject to 
interpretation as a VEX prefix. 

Segment not 
present, #NP 
(selector) 



X 

The DS, ES, FS, or GS register was loaded with a non-null 
segment selector and the segment was marked not present. 

Stack, #SS 

X 

X 

X 

A memory address exceeded the stack segment limit or was 
non-canonical. 

Stack, #SS 
(selector) 



X 

The SS register was loaded with a non-null segment selector 
and the segment was marked not present. 

General protection, 
#GP 

X 

X 

X 

A memory address exceeded a data segment limit or was non- 
canonical. 



X 

A null data segment was used to reference memory. 




X 

A segment register was loaded, but the segment descriptor 
exceeded the descriptor table limit. 




X 

A segment register was loaded and the segment selector’s Tl 
bit was set, but the LDT selector was a null selector. 




X 

The SS register was loaded with a null segment selector in 
non-64-bit mode or while CPL = 3. 

General protection, 
#GP 

(selector) 



X 

The SS register was loaded and the segment selector RPL 
and the segment descriptor DPL were not equal to the CPL. 



X 

The SS register was loaded and the segment pointed to was 
not a writable data segment. 




X 

The DS, ES, FS, or GS register was loaded and the segment 
pointed to was a data or non-conforming code segment, but 
the RPL or CPL was greater than the DPL. 




X 

The DS, ES, FS, or GS register was loaded and the segment 
pointed to was not a data segment or readable code segment. 

Page fault, #PF 


X 

X 

A page fault resulted from the execution of the instruction. 

Alignment check, 
#AC 


X 

X 

An unaligned memory reference was performed while 
alignment checking was enabled. 
LEA Load Effective Address 

Computes the effective address of a memory location (second operand) and stores it in a general- 
purpose register (first operand). 

The address size of the memory location and the size of the register determine the specific action taken 
by the instruction, as follows: 

• If the address size and the register size are the same, the instruction stores the effective address as 
computed. 

• If the address size is longer than the register size, the instruction truncates the effective address to 
the size of the register. 

• If the address size is shorter than the register size, the instruction zero-extends the effective address 
to the size of the register. 
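
The three size rules above can be sketched as a small Python model. The function name lea_result and its parameters are hypothetical. The model assumes that the effective address is first formed modulo the address size, after which the register size decides between storing, truncating, and zero-extending.

```python
def lea_result(effective_address: int, address_bits: int, register_bits: int) -> int:
    """Model the value LEA writes for a given address size and register size."""
    # Assumption: the effective address wraps at the address size.
    ea = effective_address & ((1 << address_bits) - 1)
    # Same size: stored as computed. Smaller register: truncated.
    # Larger register: zero-extended (the upper bits of ea are already 0).
    return ea & ((1 << register_bits) - 1)
```

With a 64-bit address and a 32-bit register the upper half of the address is dropped; with a 32-bit address and a 64-bit register the result is never sign-extended.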

If the second operand is a register, an undefined-opcode exception occurs. 

The LEA instruction is related to the MOV instruction, which copies data from a memory location to a 
register, but LEA takes the address of the source operand, whereas MOV takes the contents of the 
memory location specified by the source operand. In the simplest cases, LEA can be replaced with 
MOV. For example: 

lea eax, [ebx] 

has the same effect as: 

mov eax, ebx 

However, LEA allows software to use any valid ModRM and SIB addressing mode for the source 
operand. For example: 

lea eax, [ebx+edi] 

loads the sum of the EBX and EDI registers into the EAX register. This could not be accomplished by 
a single MOV instruction. 

The LEA instruction has a limited capability to perform multiplication of operands in general-purpose 
registers using scaled-index addressing. For example: 

lea eax, [ebx+ebx*8] 

loads the value of the EBX register, multiplied by 9, into the EAX register. Possible values of 
multipliers are 2, 4, 8, 3, 5, and 9. 
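
The list of multipliers follows from the scaled-index form: an index register scaled by 1, 2, 4, or 8, optionally added to a base that is the same register. The following Python sketch (hypothetical helper names, for illustration only) enumerates them and models the 32-bit example above.

```python
SCALES = (1, 2, 4, 8)  # scale factors available in scaled-index addressing


def lea_multipliers() -> set:
    """Multipliers reachable as [reg*scale] or [reg + reg*scale]."""
    index_only = set(SCALES)                    # lea r, [reg*scale]
    base_plus_index = {1 + s for s in SCALES}   # lea r, [reg + reg*scale]
    return (index_only | base_plus_index) - {1}  # multiplying by 1 is a plain copy


def lea_times_9(ebx: int) -> int:
    """Model of: lea eax, [ebx+ebx*8] with 32-bit address and operand sizes."""
    return (ebx + ebx * 8) & 0xFFFF_FFFF
```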

The LEA instruction is widely used in string-processing and array-processing to initialize an index 
register (rSI or rDI) before performing string instructions such as MOVSx. It is also used to initialize 
the rBX register before performing the XLAT instruction in programs that perform character 
translations. In data structures, the LEA instruction can calculate addresses of operands stored in 
memory, and in particular, addresses of array or string elements. 
Mnemonic          Opcode    Description

LEA reg16, mem    8D /r     Store effective address in a 16-bit register.
LEA reg32, mem    8D /r     Store effective address in a 32-bit register.
LEA reg64, mem    8D /r     Store effective address in a 64-bit register.

Related Instructions

MOV

rFLAGS Affected

None

Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD       X     X             X          The source operand was a register.

LEAVE Delete Procedure Stack Frame 

Releases a stack frame created by a previous ENTER instruction. To release the frame, it copies the 
frame pointer (in the rBP register) to the stack pointer register (rSP), and then pops the old frame 
pointer from the stack into the rBP register, thus restoring the stack frame of the calling procedure. 

The 32-bit LEAVE instruction is equivalent to the following 32-bit operation: 

MOV ESP,EBP 
POP EBP 

To return program control to the calling procedure, execute a RET instruction after the LEAVE 
instruction. 
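
As an illustration of the two-step behavior, the following Python sketch applies LEAVE to a toy machine state. The register dictionary, the sparse stack dictionary keyed by address, and the function name are all hypothetical; faults and segment checks are not modeled.

```python
def leave(regs: dict, stack: dict, operand_bytes: int = 8) -> None:
    """Apply LEAVE in place to a register file and a sparse stack memory."""
    regs["rsp"] = regs["rbp"]          # MOV rSP, rBP
    regs["rbp"] = stack[regs["rsp"]]   # POP rBP: read the saved frame pointer...
    regs["rsp"] += operand_bytes       # ...and release its stack slot
```

After the call, the stack pointer addresses the return address pushed by the caller, which is why a RET can follow directly.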

In 64-bit mode, the LEAVE operand size defaults to 64 bits, and there is no prefix available for 
encoding a 32-bit operand size. 


Mnemonic    Opcode    Description

LEAVE       C9        Set the stack pointer register SP to the value in the BP
                      register and pop BP.
LEAVE       C9        Set the stack pointer register ESP to the value in the
                      EBP register and pop EBP.
                      (No prefix for encoding this in 64-bit mode.)
LEAVE       C9        Set the stack pointer register RSP to the value in the
                      RBP register and pop RBP.

Related Instructions

ENTER

rFLAGS Affected

None

Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was
                                                         non-canonical.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while
                                                         alignment checking was enabled.

LFENCE Load Fence 

Acts as a barrier to force strong memory ordering (serialization) between load instructions preceding 
the LFENCE and load instructions that follow the LFENCE. Loads from differing memory types may 
be performed out of order, in particular between WC/WC+ and other memory types. The LFENCE 
instruction assures that the system completes all previous loads before executing subsequent loads. 

The LFENCE instruction is weakly-ordered with respect to store instructions, data and instruction 
prefetches, and the SFENCE instruction. Speculative loads initiated by the processor, or specified 
explicitly using cache-prefetch instructions, can be reordered around an LFENCE. 

In addition to load instructions, the LFENCE instruction is strongly ordered with respect to other 
LFENCE instructions, as well as MFENCE and other serializing instructions. Further details on the 
use of MFENCE to order accesses among differing memory types may be found in AMD64 
Architecture Programmer’s Manual Volume 2: System Programming, section 7.4 “Memory Types” on 
page 172. 

LFENCE is an SSE2 instruction. Support for SSE2 instructions is indicated by CPUID 
Fn0000_0001_EDX[SSE2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic    Opcode      Description

LFENCE      0F AE E8    Force strong ordering of (serialize) load operations.

Related Instructions

MFENCE, SFENCE, MCOMMIT

rFLAGS Affected

None

Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD       X     X             X          SSE2 instructions are not supported, as indicated by CPUID
                                                         Fn0000_0001_EDX[SSE2] = 0.

LLWPCB Load Lightweight Profiling Control Block Address 

Parses the Lightweight Profiling Control Block at the address contained in the specified register. If the 
LWPCB is valid, writes the address into the LWP_CBADDR MSR and enables Lightweight Profiling. 

See Volume 2, Chapter 13, for an overview of the lightweight profiling facility. 

The LWPCB must be in memory that is readable and writable in user mode. For better performance, it 
should be aligned on a 64-byte boundary in memory and placed so that it does not cross a page 
boundary, though neither of these suggestions is required. 

The LWPCB address in the register is truncated to 32 bits if the operand size is 32. 

Action 

1. If LWP is not available or if the machine is not in protected mode, LLWPCB immediately causes 
a #UD exception. 

2. If LWP is already enabled, the processor flushes the LWP state to memory in the old LWPCB. See 
description of the SLWPCB instruction on page 331 for details on saving the active LWP state. 

If the flush causes a #PF exception, LWP remains enabled with the old LWPCB still active. Note 
that the flush is done before LWP attempts to access the new LWPCB. 

3. If the specified LWPCB address is 0, LWP is disabled and the execution of LLWPCB is complete. 

4. The LWPCB address is non-zero. LLWPCB validates it as follows: 

• If any part of the LWPCB or the ring buffer is beyond the data segment limit, LLWPCB causes 
a #GP exception. 

• If the ring buffer size is below the implementation’s minimum ring buffer size, LLWPCB 
causes a #GP exception. 

• While doing these checks, LWP reads and writes the LWPCB, which may cause a #PF 
exception. 

If any of these exceptions occurs, LLWPCB aborts and LWP is left disabled. Usually, the operating 
system will handle a #PF exception by making the memory available and returning to retry the 
LLWPCB instruction. The #GP exceptions indicate application programming errors. 

5. LWP converts the LWPCB address and the ring buffer address to linear address form by adding 
the DS base address and stores the addresses internally. 

6. LWP examines the LWPCB.Flags field to determine which events should be enabled and whether 
threshold interrupts should be taken. It clears the bits for any features that are not available and 
stores the result back to LWPCB.Flags to inform the application of the actual LWP state. 

7. For each event being enabled, LWP examines the EventIntervaln value and, if necessary, sets it to 
an implementation-defined minimum. (The minimum event interval for LWPVAL is zero.) It 
loads its internal counter for the event from the value in EventCountern. A zero or negative value 
in EventCountern means that the next event of that type will cause an event record to be stored. To 
count every jth event, a program should set EventIntervaln to j-1 and EventCountern to some 
starting value (where j-1 is a good initial count). If the counter value is larger than the interval, the 
first event record will be stored after a larger number of events than subsequent records. 

8. LWP is started. The execution of LLWPCB is complete. 
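
The counter behavior described in step 7 can be illustrated with a short Python simulation. This is a sketch under stated assumptions: every event decrements the counter, a negative result stores a record and reloads the counter from the interval (as described for LWPVAL), and interval randomization and the implementation-defined minimum are ignored. The function and parameter names are hypothetical.

```python
def record_positions(event_counter: int, event_interval: int, num_events: int) -> list:
    """Return the 1-based positions of the events that store a record."""
    stored = []
    counter = event_counter
    for position in range(1, num_events + 1):
        counter -= 1                   # each event decrements the counter
        if counter < 0:                # a negative result stores a record
            stored.append(position)
            counter = event_interval   # reload from the event interval
    return stored
```

With the interval and the initial count both set to j-1 = 2, every third event is recorded. With an initial count of 5, the first record is delayed to the sixth event and later records again arrive every third event.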

Notes 

If none of the bits in the LWPCB.Flags specifies an available event, LLWPCB still enables LWP to 
allow the use of the LWPINS instruction. However, no other event records will be stored. 

A program can temporarily disable LWP by executing SLWPCB to obtain the current LWPCB 
address, saving that value, and then executing LLWPCB with a register containing 0. It can later 
re-enable LWP by executing LLWPCB with a register containing the saved address. 

When LWP is enabled, it is typically an error to execute LLWPCB with the address of the active 
LWPCB. When the hardware flushes the existing LWP state into the LWPCB, it may overwrite fields 
that the application may have set to new LWP parameter values. The flushed values will then be loaded 
as LWP is restarted. To reuse an LWPCB, an application should stop LWP by passing a zero to 
LLWPCB, then prepare the LWPCB with new parameters and execute LLWPCB again to restart LWP. 

Internally, LWP keeps the linear address of the LWPCB and the ring buffer. If the application changes 
the value of DS, LWP will continue to collect samples even if the new DS value would no longer allow 
access to the LWPCB or the ring buffer. However, a #GP fault will occur if the application uses XRSTOR 
to restore LWP state saved by XSAVE. Programs should avoid using XSAVE/XRSTOR on LWP state 
if DS has changed. This only applies when the CPL != 0; kernel mode operation of XRSTOR is 
unaffected by changes to DS. See instruction listing for XSAVE in Volume 4 for details. 

Operating system and hypervisor code that runs when CPL != 3 should use XSAVE and XRSTOR to 
control LWP rather than using LLWPCB. Use WRMSR to write 0 to the LWP_CBADDR MSR to 
immediately stop LWP without saving its current state. 

It is possible to execute LLWPCB when the CPL != 3 or when SMM is active, but the system software 
must ensure that the LWPCB and the entire ring buffer are properly mapped into writable memory in 
order to avoid a #PF or #GP fault. Furthermore, if LWP is enabled when a kernel executes LLWPCB, 
both the old and new control blocks and ring buffers must be accessible. Using LLWPCB in these 
situations is not recommended. 

LLWPCB is an LWP instruction. Support for LWP instructions is indicated by CPUID 
Fn8000_0001_ECX[LWP] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 
Instruction Encoding

                                    Encoding
Mnemonic        XOP    RXB.map_select    W.vvvv.L.pp    Opcode

LLWPCB reg32    8F     RXB.09            0.1111.0.00    12 /0
LLWPCB reg64    8F     RXB.09            1.1111.0.00    12 /0

ModRM.reg augments the opcode and is assigned the value 0. ModRM.r/m (augmented by XOP.R) 
specifies the register containing the effective address of the LWPCB. ModRM.mod is 11b. 

Related Instructions 

SLWPCB, LWPVAL, LWPINS 

rFLAGS Affected 

None 


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD       X     X             X          LWP instructions are not supported, as indicated by CPUID
                                                         Fn8000_0001_ECX[LWP] = 0.
                          X     X                        The system is not in protected mode.
                                              X          LWP is not available, or mod != 11b, or vvvv != 1111b.
General protection, #GP                       X          Any part of the LWPCB or the event ring buffer is beyond the
                                                         DS segment limit.
                                              X          Any restrictions on the contents of the LWPCB are violated.
Page fault, #PF                               X          A page fault resulted from reading or writing the LWPCB.
                                              X          LWP was already enabled and a page fault resulted from
                                                         reading or writing the old LWPCB.
                                              X          LWP was already enabled and a page fault resulted from
                                                         flushing an event to the old ring buffer.

LODS Load String 

LODSB 

LODSW 

LODSD 

LODSQ 

Copies the byte, word, doubleword, or quadword in the memory location pointed to by the DS:rSI 
registers to the AL, AX, EAX, or RAX register, depending on the size of the operand, and then 
increments or decrements the rSI register according to the state of the DF flag in the rFLAGS register. 

If the DF flag is 0, the instruction increments rSI; otherwise, it decrements rSI. It increments or 
decrements rSI by 1, 2, 4, or 8, depending on the number of bytes being loaded. 
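
The load-and-step behavior can be modeled in Python as follows. This is a simplified sketch: memory is a flat bytes object indexed directly by rSI, the segment base and address-size wraparound are ignored, and the function name lods is hypothetical.

```python
def lods(memory: bytes, rsi: int, size: int, df: int):
    """Return (value loaded into AL/AX/EAX/RAX, updated rSI)."""
    if size not in (1, 2, 4, 8):
        raise ValueError("operand size must be 1, 2, 4, or 8 bytes")
    value = int.from_bytes(memory[rsi:rsi + size], "little")
    step = -size if df else size       # DF = 0 increments, DF = 1 decrements
    return value, rsi + step
```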

The forms of the LODS instruction with an explicit operand address the operand at seg:[rSI]. The 
value of seg defaults to the DS segment, but may be overridden by a segment prefix. The explicit 
operand serves only to specify the type (size) of the value being copied and the specific registers used. 

The no-operands forms of the instruction always use the DS:[rSI] registers to point to the value to be 
copied (they do not allow a segment prefix). The mnemonic determines the size of the operand and the 
specific registers used. 

The LODSx instructions support the REP prefixes. For details about the REP prefixes, see “Repeat 
Prefixes” on page 12. More often, software uses the LODSx instruction inside a loop controlled by a 
LOOPcc instruction as a more efficient replacement for instructions like: 

mov eax, dword ptr ds:[esi] 
add esi, 4 

The LODSQ instruction can only be used in 64-bit mode. 


Mnemonic      Opcode    Description

LODS mem8     AC        Load byte at DS:rSI into AL and then increment or decrement rSI.
LODS mem16    AD        Load word at DS:rSI into AX and then increment or decrement rSI.
LODS mem32    AD        Load doubleword at DS:rSI into EAX and then increment or decrement rSI.
LODS mem64    AD        Load quadword at DS:rSI into RAX and then increment or decrement rSI.
LODSB         AC        Load byte at DS:rSI into AL and then increment or decrement rSI.
LODSW         AD        Load word at DS:rSI into AX and then increment or decrement rSI.
LODSD         AD        Load doubleword at DS:rSI into EAX and then increment or decrement rSI.
LODSQ         AD        Load quadword at DS:rSI into RAX and then increment or decrement rSI.


Related Instructions 

MOVSx, STOSx 

rFLAGS Affected 

None 


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was
                                                         non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was
                                                         non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while
                                                         alignment checking was enabled.

LOOP Loop 

LOOPE 

LOOPNE 

LOOPNZ 

LOOPZ 

Decrements the count register (rCX) by 1, then, if rCX is not 0 and the ZF flag meets the condition 
specified by the mnemonic, it jumps to the target instruction specified by the signed 8-bit relative 
offset. Otherwise, it continues with the next instruction after the LOOPcc instruction. 

The size of the count register used (CX, ECX, or RCX) depends on the address-size attribute of the 
LOOPcc instruction. 

The LOOP instruction ignores the state of the ZF flag. 

The LOOPE and LOOPZ instructions jump if rCX is not 0 and the ZF flag is set to 1. In other words, 
the instruction exits the loop (falls through to the next instruction) if rCX becomes 0 or ZF = 0. 

The LOOPNE and LOOPNZ instructions jump if rCX is not 0 and the ZF flag is cleared to 0. In other 
words, the instruction exits the loop if rCX becomes 0 or ZF = 1. 

The LOOPcc instruction does not change the state of the ZF flag. Typically, the loop contains a 
compare instruction to set or clear the ZF flag. 
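
The branch decision for each mnemonic can be summarized in a short Python model. It is a sketch only: the function name loopcc and its parameters are hypothetical, and the jump target calculation is not modeled.

```python
def loopcc(mnemonic: str, rcx: int, zf: int, address_bits: int = 64):
    """Return (updated count register, True if the jump is taken)."""
    mask = (1 << address_bits) - 1
    rcx = (rcx - 1) & mask             # decrement first; rFLAGS is not modified
    if mnemonic == "LOOP":
        condition = True               # ZF is ignored
    elif mnemonic in ("LOOPE", "LOOPZ"):
        condition = zf == 1
    elif mnemonic in ("LOOPNE", "LOOPNZ"):
        condition = zf == 0
    else:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    return rcx, rcx != 0 and condition
```

Because the decrement happens before the test, entering the loop with a count of 0 wraps the count register to its maximum value for the address size, and the jump is taken.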

If the jump is taken, the signed displacement is added to the rIP (of the following instruction) and the 
result is truncated to 16, 32, or 64 bits, depending on operand size. 

In 64-bit mode, the operand size defaults to 64 bits without the need for a REX prefix, and the 
processor sign-extends the 8-bit offset before adding it to the RIP. 


Mnemonic          Opcode    Description

LOOP rel8off      E2 cb     Decrement rCX, then jump short if rCX is not 0.
LOOPE rel8off     E1 cb     Decrement rCX, then jump short if rCX is not 0 and ZF is 1.
LOOPNE rel8off    E0 cb     Decrement rCX, then jump short if rCX is not 0 and ZF is 0.
LOOPNZ rel8off    E0 cb     Decrement rCX, then jump short if rCX is not 0 and ZF is 0.
LOOPZ rel8off     E1 cb     Decrement rCX, then jump short if rCX is not 0 and ZF is 1.


Related Instructions 

None 


24594 — Rev. 3.28—September 2019 



rFLAGS Affected 

None 


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

General protection, #GP   X     X             X          The target offset exceeded the code segment limit or was
                                                         non-canonical.

LWPINS Lightweight Profiling Insert Record 

Inserts a programmed event record into the LWP event ring buffer in memory and advances the ring 
buffer pointer. 

Refer to the description of the programmed event record in Volume 2, Chapter 13. The record has an 
EventId of 255. The value in the register specified by vvvv (first operand) is stored in the Data2 field at 
bytes 23-16 (zero extended if the operand size is 32). The value in a register or memory location 
(second operand) is stored in the Data1 field at bytes 7-4. The immediate value (third operand) is 
truncated to 16 bits and stored in the Flags field at bytes 3-2. 

If the ring buffer is not full, or if LWP is running in Continuous Mode, the head pointer is advanced 
and the CF flag is cleared. If the ring buffer threshold is exceeded and threshold interrupts are enabled, 
an interrupt is signaled. If LWP is in Continuous Mode and the new head pointer equals the tail pointer, 
the MissedEvents counter is incremented to indicate that the buffer wrapped. 

If the ring buffer is full and LWP is running in Synchronized Mode, the event record overwrites the last 
record in the buffer, the MissedEvents counter in the LWPCB is incremented, the head pointer is not 
advanced, and the CF flag is set. 
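
The two buffer cases above can be sketched as a Python model of the ring. Several details are assumptions made for illustration: the class and function names are hypothetical, positions are counted in record slots rather than bytes, the ring is treated as full when advancing the head would reach the tail, and threshold interrupts are not modeled.

```python
class EventRing:
    """Toy model of the LWP event ring buffer (slot-granular)."""

    def __init__(self, size: int, continuous: bool = False):
        self.size = size               # number of record slots
        self.continuous = continuous   # False models Synchronized Mode
        self.head = 0
        self.tail = 0
        self.missed_events = 0
        self.slots = [None] * size

    def is_full(self) -> bool:
        # Assumption: full when advancing head would reach tail.
        return (self.head + 1) % self.size == self.tail


def lwpins(ring: EventRing, record) -> int:
    """Store a record and return the resulting CF value."""
    if ring.is_full() and not ring.continuous:
        ring.slots[(ring.head - 1) % ring.size] = record  # overwrite last record
        ring.missed_events += 1
        return 1                       # head not advanced, CF set
    ring.slots[ring.head] = record
    ring.head = (ring.head + 1) % ring.size
    if ring.continuous and ring.head == ring.tail:
        ring.missed_events += 1        # the buffer wrapped
    return 0                           # CF cleared
```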

LWPINS generates an invalid opcode exception (#UD) if the machine is not in protected mode or if 
LWP is not available. 

LWPINS simply clears CF if LWP is not enabled. This allows LWPINS instructions to be harmlessly 
ignored if profiling is turned off. 

It is possible to execute LWPINS when the CPL != 3 or when SMM is active, but the system software 
must ensure that the memory operand (if present), the LWPCB, and the entire ring buffer are properly 
mapped into writable memory in order to avoid a #PF or #GP fault. Using LWPINS in these situations 
is not recommended. 

LWPINS can be used by a program to mark significant events in the ring buffer as they occur. For 
instance, a program might capture information on changes in the process’ address space such as library 
loads and unloads, or changes in the execution environment such as a change in the state of a 
user-mode thread of control. 

Note that when the LWPINS instruction finishes writing an event record in the event ring buffer, it 
counts as an instruction retired. If the Instructions Retired event is active, this might cause that counter 
to become negative and immediately store another event record with the same instruction address (but 
different EventId values). 

LWPINS is an LWP instruction. Support for LWP instructions is indicated by CPUID 
Fn8000_0001_ECX[LWP] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 
Instruction Encoding

                                                           Encoding
Mnemonic                               XOP    RXB.map_select    W.vvvv.L.pp    Opcode

LWPINS reg32.vvvv, reg/mem32, imm32    8F     RXB.0A            0.src1.0.00    12 /0 /imm32
LWPINS reg64.vvvv, reg/mem32, imm32    8F     RXB.0A            1.src1.0.00    12 /0 /imm32

ModRM.reg augments the opcode and is assigned the value 0. The {mod, r/m} field of the ModRM 
byte (augmented by XOP.R) encodes the second operand. A 4-byte immediate field follows ModRM. 

Related Instructions 

LLWPCB, SLWPCB, LWPVAL 

rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL    OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                                                                  M
21    20    19    18    17    16    14    13:12   11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD       X     X             X          LWP instructions are not supported, as indicated by CPUID
                                                         Fn8000_0001_ECX[LWP] = 0.
                          X     X                        The system is not in protected mode.
                                              X          LWP is not available.
Page fault, #PF                               X          A page fault resulted from reading or writing the LWPCB.
                                              X          A page fault resulted from writing the event to the ring buffer.
                                              X          A page fault resulted from reading a modrm operand from
                                                         memory.
General protection, #GP                       X          A modrm operand in memory exceeded the segment limit.

LWPVAL Lightweight Profiling Insert Value 

Decrements the event counter associated with the programmed value sample event (see “Programmed 
Value Sample” in Volume 2, Chapter 13). If the resulting counter value is negative, inserts an event 
record into the LWP event ring buffer in memory and advances the ring buffer pointer. 

Refer to the description of the programmed value sample record in Volume 2, Chapter 13. The event 
record has an EventId of 1. The value in the register specified by vvvv (first operand) is stored in the 
Data2 field at bytes 23-16 (zero extended if the operand size is 32). The value in a register or memory 
location (second operand) is stored in the Data1 field at bytes 7-4. The immediate value (third 
operand) is truncated to 16 bits and stored in the Flags field at bytes 3-2. 

If the programmed value sample record is not written to the event ring buffer, the memory location of 
the second operand (assuming it is memory-based) is not accessed. 

If the ring buffer is not full or if LWP is running in continuous mode, the head pointer is advanced and 
the event counter is reset to the interval for the event (subject to randomization). If the ring buffer 
threshold is exceeded and threshold interrupts are enabled, an interrupt is signaled. If LWP is in 
Continuous Mode and the new head pointer equals the tail pointer, the MissedEvents counter is 
incremented to indicate that the buffer wrapped. 

If the ring buffer is full and LWP is running in Synchronized Mode, the event record overwrites the last 
record in the buffer, the MissedEvents counter in the LWPCB is incremented, and the head pointer is 
not advanced. 

LWPVAL generates an invalid opcode exception (#UD) if the machine is not in protected mode or if 
LWP is not available. 

LWPVAL does nothing if LWP is not enabled or if the Programmed Value Sample event is not enabled 
in LWPCB.Flags. This allows LWPVAL instructions to be harmlessly ignored if profiling is turned off. 

It is possible to execute LWPVAL when the CPL != 3 or when SMM is active, but the system software 
must ensure that the memory operand (if present), the LWPCB, and the entire ring buffer are properly 
mapped into writable memory in order to avoid a #PF or #GP fault. Using LWPVAL in these situations 
is not recommended. 

LWPVAL can be used by a program to perform value profiling. This is the technique of sampling the 
value of some program variable at a predetermined frequency. For example, a managed runtime might 
use LWPVAL to sample the value of the divisor for a frequently executed divide instruction in order to 
determine whether to generate specialized code for a common division. It might sample the target 
location of an indirect branch or call to see if one destination is more frequent than others. Since 
LWPVAL does not modify any registers or condition codes, it can be inserted harmlessly between any 
instructions. 
Note 

When LWPVAL completes (whether or not it stored an event record in the event ring buffer), it counts 
as an instruction retired. If the Instructions Retired event is active, this might cause that counter to 
become negative and immediately store an event record. If LWPVAL also stored an event record, the 
buffer will contain two records with the same instruction address (but different EventId values). 

LWPVAL is an LWP instruction. Support for LWP instructions is indicated by CPUID 
Fn8000_0001_ECX[LWP] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 

Instruction Encoding

                                                           Encoding
Mnemonic                               XOP    RXB.map_select    W.vvvv.L.pp    Opcode

LWPVAL reg32.vvvv, reg/mem32, imm32    8F     RXB.0A            0.src1.0.00    12 /1 /imm32
LWPVAL reg64.vvvv, reg/mem32, imm32    8F     RXB.0A            1.src1.0.00    12 /1 /imm32

ModRM.reg augments the opcode and is assigned the value 001b. The {mod, r/m} field of the 
ModRM byte (augmented by XOP.R) encodes the second operand. A four-byte immediate field 
follows ModRM. 

Related Instructions 

LLWPCB, SLWPCB, LWPINS 

rFLAGS Affected 

None 
Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD       X     X             X          LWP instructions are not supported, as indicated by CPUID
                                                         Fn8000_0001_ECX[LWP] = 0.
                          X     X                        The system is not in protected mode.
                                              X          LWP is not available.
Page fault, #PF                               X          A page fault resulted from reading or writing the LWPCB.
                                              X          A page fault resulted from writing the event to the ring buffer.
                                              X          A page fault resulted from reading a modrm operand from
                                                         memory.
General protection, #GP                       X          A modrm operand in memory exceeded the segment limit.

LZCNT Count Leading Zeros 

Counts the number of leading zero bits in the 16-, 32-, or 64-bit general purpose register or memory 
source operand. Counting starts downward from the most significant bit and stops when the highest bit 
having a value of 1 is encountered or when the least significant bit is encountered. The count is written 
to the destination register. 

This instruction has two operands: 

LZCNT dest, src 

If the input operand is zero, CF is set to 1 and the size (in bits) of the input operand is written to the 
destination register. Otherwise, CF is cleared. 

If the most significant bit is a one, the ZF flag is set to 1 and zero is written to the destination register. 
Otherwise, ZF is cleared. 
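
The result and flag rules above can be checked against a small Python reference model. It is a sketch for illustration only; the function name lzcnt and its tuple return value are hypothetical, and the flags this instruction leaves undefined are not modeled.

```python
def lzcnt(src: int, operand_bits: int):
    """Return (count, CF, ZF) for a 16-, 32-, or 64-bit source operand."""
    if operand_bits not in (16, 32, 64):
        raise ValueError("operand size must be 16, 32, or 64 bits")
    src &= (1 << operand_bits) - 1
    count = operand_bits - src.bit_length()  # bit_length() is 0 when src is 0
    cf = 1 if src == 0 else 0                # CF marks an all-zero input
    zf = 1 if count == 0 else 0              # ZF marks a set most significant bit
    return count, cf, zf
```

For a zero input the count equals the operand size and CF is 1; for an input with its most significant bit set the count is 0 and ZF is 1.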

LZCNT is an Advanced Bit Manipulation (ABM) instruction. Support for the LZCNT instruction is 
indicated by CPUID Fn8000_0001_ECX[ABM] = 1. If the LZCNT instruction is not available, the 
encoding is interpreted as the BSR instruction. Software MUST check the CPUID bit once per 
program or library initialization before using the LZCNT instruction, or inconsistent behavior may 
result. 
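
One way to structure the once-per-initialization check is sketched below in Python. Python cannot execute CPUID itself, so the sketch takes a previously captured Fn8000_0001_ECX value as input. The bit positions in the table are assumptions taken from AMD's CPUID documentation and should be confirmed against Appendix D; the function names and the returned strings are hypothetical.

```python
# Assumed bit positions within CPUID Fn8000_0001_ECX (verify against Appendix D).
FN8000_0001_ECX_BITS = {"LahfSahf": 0, "ABM": 5, "LWP": 15}


def has_feature(ecx: int, name: str) -> bool:
    """Test one feature flag in a captured Fn8000_0001_ECX value."""
    return bool((ecx >> FN8000_0001_ECX_BITS[name]) & 1)


def select_leading_zero_routine(ecx: int) -> str:
    """Decide once, at initialization, which code path to use."""
    # Without ABM the LZCNT encoding executes as BSR, so use another path.
    return "lzcnt" if has_feature(ecx, "ABM") else "software-fallback"
```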

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                  Opcode         Description
LZCNT reg16, reg/mem16    F3 0F BD /r    Count the number of leading zeros in reg/mem16
LZCNT reg32, reg/mem32    F3 0F BD /r    Count the number of leading zeros in reg/mem32
LZCNT reg64, reg/mem64    F3 0F BD /r    Count the number of leading zeros in reg/mem64

Related Instructions

ANDN, BEXTR, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, BSF,
BSR, POPCNT, T1MSKC, TZCNT, TZMSK

rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                U                       U     M     U     U     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                     Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                     X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP        X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                      X      A null data segment was used to reference memory.
Page fault, #PF                          X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                     X            X      An unaligned memory reference was performed while alignment checking was enabled.

MCOMMIT Commit Stores to Memory

MCOMMIT provides a fencing and error detection capability for stores to system memory 
components that have delayed error reporting. Execution of MCOMMIT ensures that any preceding 
stores in the thread to such memory components have completed (target locations written, unless 
inhibited by an error condition) and that any errors encountered by those stores have been signaled to 
associated error logging resources. If any such errors are present, MCOMMIT clears rFLAGS.CF
to zero; otherwise, it sets rFLAGS.CF to one.

These errors are specific to the design of the platform and are reported only via MCOMMIT and in 
associated error logging registers on the platform; they are not visible to the Machine Check 
Architecture. Execution of MCOMMIT does not change any state in the error logging resources. Any 
error indications will need to be cleared by privileged software before MCOMMIT can return an error- 
free indication. Details on the error logging mechanisms may be found in the Processor Programming 
Reference manual for any product that supports this technology and the MCOMMIT instruction. 

The MCOMMIT instruction is supported if the feature flag CPUID Fn8000_0008_EBX[MCOMMIT]
= 1 (bit 8). The MCOMMIT instruction must be explicitly enabled by the OS by setting
EFER.MCOMMIT = 1 (EFER bit 17); otherwise, attempted execution of MCOMMIT results in a
#UD exception.

MCOMMIT uses the same ordering rules as the SFENCE instruction. It may be executed at any 
privilege level. 

When MCOMMIT is executed in a guest VM, a hypervisor may intercept it by setting bit 3 of vector 3
(offset 14h) to 1.

Instruction Encoding

Mnemonic    Opcode         Description
MCOMMIT     F3 0F 01 FA    Commit stores to memory

Related Instructions

LFENCE, SFENCE, MFENCE


rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       0     0     0     0     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.

MFENCE Memory Fence

Acts as a barrier to force strong memory ordering (serialization) between load and store instructions 
preceding the MFENCE, and load and store instructions that follow the MFENCE. The processor may 
perform loads out of program order with respect to non-conflicting stores for certain memory types. 
The MFENCE instruction ensures that the system completes all previous memory accesses before 
executing subsequent accesses. 

The MFENCE instruction is weakly-ordered with respect to data and instruction prefetches. 
Speculative loads initiated by the processor, or specified explicitly using cache-prefetch instructions, 
can be reordered around an MFENCE. 

In addition to load and store instructions, the MFENCE instruction is strongly ordered with respect to 
other MFENCE instructions, LFENCE instructions, SFENCE instructions, serializing instructions, 
and CLFLUSH instructions. Further details on the use of MFENCE to order accesses among differing 
memory types may be found in AMD64 Architecture Programmer's Manual Volume 2: System
Programming, section 7.4 “Memory Types” on page 172. 

The MFENCE instruction is a serializing instruction. 

MFENCE is an SSE2 instruction. Support for SSE2 instructions is indicated by CPUID
Fn0000_0001_EDX[SSE2] = 1.

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 

Instruction Encoding

Mnemonic    Opcode      Description
MFENCE      0F AE F0    Force strong ordering of (serialized) load and store operations.

Related Instructions

LFENCE, SFENCE, MCOMMIT

rFLAGS Affected

None

Exceptions

Exception                     Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD            X         X            X      SSE2 instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[SSE2] = 0.

MONITORX Setup Monitor Address

Establishes a linear address range of memory for hardware to monitor and puts the processor in the 
monitor event pending state. When in the monitor event pending state, the monitoring hardware 
detects stores to the specified linear address range and causes the processor to exit the monitor event 
pending state. The MWAIT and MWAITX instructions use the state of the monitor hardware. 

The address range should be a write-back memory type. Executing MONITORX on an address range 
for a non-write-back memory type is not guaranteed to cause the processor to enter the monitor event 
pending state. The size of the linear address range that is established by the MONITORX instruction 
can be determined by CPUID function 0000_0005h. 

The [rAX] register provides the effective address. The DS segment is the default segment used to 
create the linear address. Segment overrides may be used with the MONITORX instruction. 

The ECX register specifies optional extensions for the MONITORX instruction. There are currently
no extensions defined, and setting any bits in ECX results in a #GP exception. The ECX register
operand is implicitly 32 bits.

The EDX register specifies optional hints for the MONITORX instruction. There are currently no
hints defined, and EDX is ignored by the processor. The EDX register operand is implicitly 32 bits.

The MONITORX instruction can be executed at any privilege level, and MSR
C001_0015h[MonMwaitUserEn] has no effect on MONITORX.

MONITORX performs the same segmentation and paging checks as a 1-byte read. 

Support for the MONITORX instruction is indicated by CPUID Fn0000_0001_ECX[MONITORX] = 1.

Software must check the CPUID bit once per program or library initialization before using the 
MONITORX instruction, or inconsistent behavior may result. 

The following pseudo-code shows typical usage of a MONITORX/MWAITX pair:

EAX = Linear_Address_to_Monitor;
ECX = 0; // Extensions
EDX = 0; // Hints

while (!matching_store_done) {
    MONITORX EAX, ECX, EDX
    IF (!matching_store_done) {
        MWAITX EAX, ECX
    }
}

Mnemonic    Opcode      Description
MONITORX    0F 01 FA    Establishes a range to be monitored

Related Instructions 

MWAITX, MONITOR, MWAIT 

rFLAGS Affected 

None 


Exceptions

Exception                     Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD            X         X            X      MONITORX/MWAITX instructions are not supported, as indicated by CPUID Fn0000_0001_ECX[MONITORX] = 0.
Stack, #SS                     X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP        X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                               X         X            X      ECX was non-zero.
                                                      X      A null data segment was used to reference memory.
Page fault, #PF                          X            X      A page fault resulted from the execution of the instruction.

MOV Move

Copies an immediate value or the value in a general-purpose register, segment register, or memory 
location (second operand) to a general-purpose register, segment register, or memory location. The 
source and destination must be the same size (byte, word, doubleword, or quadword) and cannot both 
be memory locations. 

In opcodes A0 through A3, the memory offsets (called moffsets) are address sized. In 64-bit mode,
memory offsets default to 64 bits. Opcodes A0-A3, in 64-bit mode, are the only cases that support a 
64-bit offset value. (In all other cases, offsets and displacements are a maximum of 32 bits.) The B8 
through BF (B8 +rq) opcodes, in 64-bit mode, are the only cases that support a 64-bit immediate value 
(in all other cases, immediate values are a maximum of 32 bits). 
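
As an illustration of the B8 +rq iq form, the following sketch assembles the bytes of a MOV reg64, imm64 instruction. It assumes the standard REX prefix layout (0100WRXB, with W = 1 selecting a 64-bit operand size and B supplying bit 3 of the register number), which is not described in this section; the function name encode_mov_reg64_imm64 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode MOV reg64, imm64 (opcode B8 +rq iq) into buf and return the length.
   reg is the 4-bit register number (0 = RAX ... 15 = R15). */
size_t encode_mov_reg64_imm64(uint8_t buf[10], unsigned reg, uint64_t imm)
{
    assert(reg < 16);
    /* Assumed REX prefix: 0100WRXB with W = 1; B carries bit 3 of reg. */
    buf[0] = (uint8_t)(0x48 | (reg >> 3));
    buf[1] = (uint8_t)(0xB8 + (reg & 7));    /* B8 +rq */
    for (int i = 0; i < 8; i++)              /* iq: 64-bit immediate, low byte first */
        buf[2 + i] = (uint8_t)(imm >> (8 * i));
    return 10;
}
```

The ten-byte result shows why this is the only MOV form that can carry a full 64-bit immediate: the other immediate forms encode at most four immediate bytes.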

When reading segment-registers with a 32-bit operand size, the processor zero-extends the 16-bit 
selector results to 32 bits. When reading segment-registers with a 64-bit operand size, the processor 
zero-extends the 16-bit selector to 64 bits. If the destination operand specifies a segment register (DS, 
ES, FS, GS, or SS), the source operand must be a valid segment selector. 

It is possible to move a null segment selector value (0000-0003h) into the DS, ES, FS, or GS register. 
This action does not cause a general protection fault, but a subsequent reference to such a segment 
does cause a #GP exception. For more information about segment selectors, see “Segment Selectors 
and Registers” in Volume 2. 

When the MOV instruction is used to load the SS register, the processor blocks external interrupts until 
after the execution of the following instruction. This action allows the following instruction to be a 
MOV instruction to load a stack pointer into the ESP register (MOV ESP, val) before an interrupt 
occurs. However, the LSS instruction provides a more efficient method of loading SS and ESP. 

Attempting to use the MOV instruction to load the CS register generates an invalid opcode exception 
(#UD). Use the far JMP, CALL, or RET instructions to load the CS register. 

To initialize a register to 0, rather than using a MOV instruction, it may be more efficient to use the 
XOR instruction with identical destination and source operands. 


Mnemonic                         Opcode       Description
MOV reg/mem8, reg8               88 /r        Move the contents of an 8-bit register to an 8-bit destination register or memory operand.
MOV reg/mem16, reg16             89 /r        Move the contents of a 16-bit register to a 16-bit destination register or memory operand.
MOV reg/mem32, reg32             89 /r        Move the contents of a 32-bit register to a 32-bit destination register or memory operand.
MOV reg/mem64, reg64             89 /r        Move the contents of a 64-bit register to a 64-bit destination register or memory operand.
MOV reg8, reg/mem8               8A /r        Move the contents of an 8-bit register or memory operand to an 8-bit destination register.
MOV reg16, reg/mem16             8B /r        Move the contents of a 16-bit register or memory operand to a 16-bit destination register.
MOV reg32, reg/mem32             8B /r        Move the contents of a 32-bit register or memory operand to a 32-bit destination register.
MOV reg64, reg/mem64             8B /r        Move the contents of a 64-bit register or memory operand to a 64-bit destination register.
MOV reg16/32/64/mem16, segReg    8C /r        Move the contents of a segment register to a 16-bit, 32-bit, or 64-bit destination register or to a 16-bit memory operand.
MOV segReg, reg/mem16            8E /r        Move the contents of a 16-bit register or memory operand to a segment register.
MOV AL, moffset8                 A0           Move 8-bit data at a specified memory offset to the AL register.
MOV AX, moffset16                A1           Move 16-bit data at a specified memory offset to the AX register.
MOV EAX, moffset32               A1           Move 32-bit data at a specified memory offset to the EAX register.
MOV RAX, moffset64               A1           Move 64-bit data at a specified memory offset to the RAX register.
MOV moffset8, AL                 A2           Move the contents of the AL register to an 8-bit memory offset.
MOV moffset16, AX                A3           Move the contents of the AX register to a 16-bit memory offset.
MOV moffset32, EAX               A3           Move the contents of the EAX register to a 32-bit memory offset.
MOV moffset64, RAX               A3           Move the contents of the RAX register to a 64-bit memory offset.
MOV reg8, imm8                   B0 +rb ib    Move an 8-bit immediate value into an 8-bit register.
MOV reg16, imm16                 B8 +rw iw    Move a 16-bit immediate value into a 16-bit register.
MOV reg32, imm32                 B8 +rd id    Move a 32-bit immediate value into a 32-bit register.
MOV reg64, imm64                 B8 +rq iq    Move a 64-bit immediate value into a 64-bit register.
MOV reg/mem8, imm8               C6 /0 ib     Move an 8-bit immediate value to an 8-bit register or memory operand.
MOV reg/mem16, imm16             C7 /0 iw     Move a 16-bit immediate value to a 16-bit register or memory operand.
MOV reg/mem32, imm32             C7 /0 id     Move a 32-bit immediate value to a 32-bit register or memory operand.
MOV reg/mem64, imm32             C7 /0 id     Move a 32-bit signed immediate value to a 64-bit register or memory operand.

Related Instructions

MOV CRn, MOV DRn, MOVD, MOVSX, MOVZX, MOVSXD, MOVSx

rFLAGS Affected 

None 

Exceptions

Exception                           Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X         X            X      An attempt was made to load the CS register.
Segment not present, #NP (selector)                         X      The DS, ES, FS, or GS register was loaded with a non-null segment selector and the segment was marked not present.
Stack, #SS                           X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
Stack, #SS (selector)                                       X      The SS register was loaded with a non-null segment selector, and the segment was marked not present.
General protection, #GP              X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                            X      The destination operand was in a non-writable segment.
                                                            X      A null data segment was used to reference memory.
General protection, #GP (selector)                          X      A segment register was loaded, but the segment descriptor exceeded the descriptor table limit.
                                                            X      A segment register was loaded and the segment selector's TI bit was set, but the LDT selector was a null selector.
                                                            X      The SS register was loaded with a null segment selector in non-64-bit mode or while CPL = 3.
                                                            X      The SS register was loaded and the segment selector RPL and the segment descriptor DPL were not equal to the CPL.
                                                            X      The SS register was loaded and the segment pointed to was not a writable data segment.
                                                            X      The DS, ES, FS, or GS register was loaded and the segment pointed to was a data or non-conforming code segment, but the RPL or CPL was greater than the DPL.
                                                            X      The DS, ES, FS, or GS register was loaded and the segment pointed to was not a data segment or readable code segment.
Page fault, #PF                                X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                           X            X      An unaligned memory reference was performed while alignment checking was enabled.

MOVBE Move Big Endian

Loads or stores a general purpose register while swapping the byte order. Operates on 16-bit, 32-bit, or 
64-bit values. Converts big-endian formatted memory data to little-endian format when loading a 
register and reverses the conversion when storing a GPR to memory. 

The load form reads a 16-, 32-, or 64-bit value from memory, swaps the byte order, and places the 
reordered value in a general-purpose register. When the operand size is 16 bits, the upper word of the 
destination register remains unchanged. In 64-bit mode, when the operand size is 32 bits, the upper 
doubleword of the destination register is cleared. 

The store form takes a 16-, 32-, or 64-bit value from a general-purpose register, swaps the byte order,
and stores the reordered value in the specified memory location. The contents of the source GPR
remain unchanged.

In the 16-bit swap, the upper and lower bytes are exchanged. In the doubleword swap operation, bits 
7:0 are exchanged with bits 31:24 and bits 15:8 are exchanged with bits 23:16. In the quadword swap 
operation, bits 7:0 are exchanged with bits 63:56, bits 15:8 with bits 55:48, bits 23:16 with bits 47:40, 
and bits 31:24 with bits 39:32. 
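
The swap rules above can be sketched in portable C. This is a software model of the byte reordering only, not of the instruction's memory access; the function names are hypothetical, and each mem argument stands for the value as it would be read by an ordinary little-endian load.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit swap: exchange the upper and lower bytes. */
uint16_t swap16(uint16_t v) { return (uint16_t)((v << 8) | (v >> 8)); }

/* 32-bit swap: bits 7:0 <-> 31:24 and bits 15:8 <-> 23:16. */
uint32_t swap32(uint32_t v)
{
    return (v << 24) | ((v & 0x0000FF00u) << 8) |
           ((v >> 8) & 0x0000FF00u) | (v >> 24);
}

/* 64-bit swap: swap each doubleword, then exchange the two doublewords. */
uint64_t swap64(uint64_t v)
{
    return ((uint64_t)swap32((uint32_t)v) << 32) | swap32((uint32_t)(v >> 32));
}

/* Load form, 16-bit operand: bits above the low word of dest are preserved. */
uint64_t movbe_load16(uint64_t dest, uint16_t mem)
{
    return (dest & ~UINT64_C(0xFFFF)) | swap16(mem);
}

/* Load form, 32-bit operand in 64-bit mode: the upper doubleword is cleared. */
uint64_t movbe_load32(uint32_t mem)
{
    return swap32(mem);
}
```

The two load helpers differ only in how they treat the rest of the destination, which mirrors the operand-size behavior described for the load form.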

Support for the MOVBE instruction is indicated by CPUID Fn0000_0001_ECX[MOVBE] = 1.

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 

Instruction Encoding

Mnemonic              Opcode         Description
MOVBE reg16, mem16    0F 38 F0 /r    Load the low word of a general-purpose register from a 16-bit memory location while swapping the bytes.
MOVBE reg32, mem32    0F 38 F0 /r    Load the low doubleword of a general-purpose register from a 32-bit memory location while swapping the bytes.
MOVBE reg64, mem64    0F 38 F0 /r    Load a 64-bit register from a 64-bit memory location while swapping the bytes.
MOVBE mem16, reg16    0F 38 F1 /r    Store the low word of a general-purpose register to a 16-bit memory location while swapping the bytes.
MOVBE mem32, reg32    0F 38 F1 /r    Store the low doubleword of a general-purpose register to a 32-bit memory location while swapping the bytes.
MOVBE mem64, reg64    0F 38 F1 /r    Store the contents of a 64-bit general-purpose register to a 64-bit memory location while swapping the bytes.

Related Instruction 
rFLAGS Affected

None

Exceptions

Exception                     Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD            X         X            X      Instruction not supported, as indicated by CPUID Fn0000_0001_ECX[MOVBE] = 0.
Stack, #SS                     X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP        X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                      X      The destination operand was in a non-writable segment.
                                                      X      A null data segment was used to reference memory.
Page fault, #PF                          X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                     X            X      An unaligned memory reference was performed while alignment checking was enabled.

MOVD Move Doubleword or Quadword

Moves a 32-bit or 64-bit value in one of the following ways: 

• from a 32-bit or 64-bit general-purpose register or memory location to the low-order 32 or 64 bits 
of an XMM register, with zero-extension to 128 bits 

• from the low-order 32 or 64 bits of an XMM to a 32-bit or 64-bit general-purpose register or 
memory location 

• from a 32-bit or 64-bit general-purpose register or memory location to the low-order 32 bits (with 
zero-extension to 64 bits) or the full 64 bits of an MMX register 

• from the low-order 32 or the full 64 bits of an MMX register to a 32-bit or 64-bit general-purpose 
register or memory location 
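
The four cases listed above can be modeled with plain integers. This is a minimal sketch under stated assumptions: xmm_model and mmx_model are hypothetical stand-ins for the registers (two 64-bit halves and one 64-bit value), and the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register models: an XMM register as two 64-bit halves,
   an MMX register as a single 64-bit value. */
typedef struct { uint64_t lo, hi; } xmm_model;
typedef uint64_t mmx_model;

/* GPR or memory to XMM: the value lands in the low bits, the rest is zeroed. */
xmm_model movd_xmm_from32(uint32_t src) { xmm_model x = { src, 0 }; return x; }
xmm_model movd_xmm_from64(uint64_t src) { xmm_model x = { src, 0 }; return x; }

/* XMM to GPR or memory: only the low 32 or 64 bits are copied out. */
uint32_t movd_xmm_to32(xmm_model x) { return (uint32_t)x.lo; }
uint64_t movd_xmm_to64(xmm_model x) { return x.lo; }

/* GPR or memory to MMX: a 32-bit source is zero-extended to 64 bits. */
mmx_model movd_mmx_from32(uint32_t src) { return src; }

/* MMX to GPR or memory: the low 32 bits (the 64-bit form copies all bits). */
uint32_t movd_mmx_to32(mmx_model m) { return (uint32_t)m; }
```

In this model the upper half of the XMM register is discarded on the way out and zeroed on the way in, so a round trip through movd_xmm_from32 and movd_xmm_to32 returns the original value.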

Figure 3-1 on page 233 illustrates the operation of the MOVD instruction. 

The MOVD instruction form that moves data to or from MMX registers is part of the MMX instruction
subset. Support for MMX instructions is indicated by CPUID Fn0000_0001_EDX[MMX] or
Fn8000_0001_EDX[MMX] = 1.

The MOVD instruction form that moves data to or from XMM registers is part of the SSE2 instruction 
subset. Support for SSE2 instructions is indicated by CPUID Fn0000_0001_EDX[SSE2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 
[Figure: eight copy operations. A reg/mem32 source is copied to bits 31:0 of an XMM or MMX register and the remaining destination bits are cleared to 0; bits 31:0 of an XMM or MMX register are copied to reg/mem32. With a REX prefix, the same copies operate on reg/mem64 and bits 63:0.]

Figure 3-1. MOVD Instruction Operation

Instruction Encoding

Mnemonic                   Opcode         Description
MOVD xmm, reg/mem32        66 0F 6E /r    Move 32-bit value from a general-purpose register or 32-bit memory location to an XMM register.
MOVD xmm, reg/mem64 [1]    66 0F 6E /r    Move 64-bit value from a general-purpose register or 64-bit memory location to an XMM register.
MOVD reg/mem32, xmm        66 0F 7E /r    Move 32-bit value from an XMM register to a 32-bit general-purpose register or memory location.
MOVD reg/mem64, xmm [1]    66 0F 7E /r    Move 64-bit value from an XMM register to a 64-bit general-purpose register or memory location.
MOVD mmx, reg/mem32        0F 6E /r       Move 32-bit value from a general-purpose register or 32-bit memory location to an MMX register.
MOVD mmx, reg/mem64        0F 6E /r       Move 64-bit value from a general-purpose register or 64-bit memory location to an MMX register.
MOVD reg/mem32, mmx        0F 7E /r       Move 32-bit value from an MMX register to a 32-bit general-purpose register or memory location.
MOVD reg/mem64, mmx        0F 7E /r       Move 64-bit value from an MMX register to a 64-bit general-purpose register or memory location.

Note: 1. Also known as MOVQ in some developer tools.


Related Instructions 

MOVDQA, MOVDQU, MOVDQ2Q, MOVQ, MOVQ2DQ 

rFLAGS Affected 

None 

MXCSR Flags Affected 

None 


Exceptions

Exception                                 Real  Virtual 8086  Protected  Description
Invalid opcode, #UD                        X         X            X      MMX instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[MMX] or Fn8000_0001_EDX[MMX] = 0.
                                           X         X            X      SSE2 instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[SSE2] = 0.
                                           X         X            X      The emulate bit (EM) of CR0 was set to 1.
                                           X         X            X      The instruction used XMM registers while CR4.OSFXSR = 0.
Device not available, #NM                  X         X            X      The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                                 X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                    X         X            X      A memory address exceeded a data segment limit or was non-canonical.
Page fault, #PF                                      X            X      A page fault resulted from the execution of the instruction.
x87 floating-point exception pending, #MF  X         X            X      An x87 floating-point exception was pending and the instruction referenced an MMX register.
Alignment check, #AC                                 X            X      An unaligned memory reference was performed while alignment checking was enabled.

MOVMSKPD Extract Packed Double-Precision Floating-Point Sign Mask

Moves the sign bits of two packed double-precision floating-point values in an XMM register (second 
operand) to the two low-order bits of a general-purpose register (first operand) with zero-extension. 
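
As a rough software model of this operation (not the hardware implementation), the sign mask can be built by reading the top bit of each double-precision value. The name movmskpd_model is hypothetical, and the sketch assumes that double uses the 64-bit IEEE 754 encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of MOVMSKPD. lanes[0] stands for bits 63:0 of the XMM register
   and lanes[1] for bits 127:64. */
uint32_t movmskpd_model(const double lanes[2])
{
    /* Assumption: double is a 64-bit IEEE 754 value on this platform. */
    assert(sizeof(double) == sizeof(uint64_t));

    uint32_t mask = 0;
    for (int i = 0; i < 2; i++) {
        uint64_t bits;
        memcpy(&bits, &lanes[i], sizeof bits);   /* read the raw encoding */
        mask |= (uint32_t)(bits >> 63) << i;     /* sign bit of lane i to bit i */
    }
    return mask;                                 /* bits 31:2 stay zero */
}
```

Because the sign bit is read from the raw encoding, negative zero contributes a 1 to the mask just as any other negative value does.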

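The function of the MOVMSKPD instruction is illustrated by the diagram below:

[Diagram: sign bits 127 and 63 of the xmm source are copied to bits 1 and 0 of reg32; bits 31:2 of reg32 are cleared to 0.]
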

The MOVMSKPD instruction is an SSE2 instruction. Support for SSE2 instructions is indicated by 
CPUID Fn0000_0001_EDX[SSE2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 

Instruction Encoding

Mnemonic               Opcode         Description
MOVMSKPD reg32, xmm    66 0F 50 /r    Move sign bits 127 and 63 in an XMM register to a 32-bit general-purpose register.

Related Instructions 

MOVMSKPS, PMOVMSKB 

rFLAGS Affected 

None 

MXCSR Flags Affected 

None 
Exceptions

Exception (vector)            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD            X         X            X      SSE2 instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[SSE2] = 0.
                               X         X            X      The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.
                               X         X            X      The emulate bit (EM) of CR0 was set to 1.
Device not available, #NM      X         X            X      The task-switch bit (TS) of CR0 was set to 1.

MOVMSKPS Extract Packed Single-Precision Floating-Point Sign Mask

Moves the sign bits of four packed single-precision floating-point values in an XMM register (second 
operand) to the four low-order bits of a general-purpose register (first operand) with zero-extension. 
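
A minimal sketch of the four-lane case follows, working directly on the raw 32-bit encodings so that no floating-point type is involved. The name movmskps_model and the array layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of MOVMSKPS. lanes[0] holds bits 31:0 of the XMM register and
   lanes[3] holds bits 127:96, each as a raw single-precision encoding. */
uint32_t movmskps_model(const uint32_t lanes[4])
{
    uint32_t mask = 0;
    for (int i = 0; i < 4; i++)
        mask |= (lanes[i] >> 31) << i;   /* bit 31 of each lane is its sign */
    return mask;                         /* bits 31:4 stay zero */
}
```

With the encodings of -1.0, 2.0, -3.0, and 4.0 in lanes 0 through 3, the model returns 5, since only lanes 0 and 2 are negative.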

The MOVMSKPS instruction is an SSE2 instruction. Support for SSE2 instructions is indicated by
CPUID Fn0000_0001_EDX[SSE2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic               Opcode      Description
MOVMSKPS reg32, xmm    0F 50 /r    Move sign bits 127, 95, 63, 31 in an XMM register to a 32-bit general-purpose register.

[Diagram: sign bits 127, 95, 63, and 31 of the xmm source are copied to bits 3, 2, 1, and 0 of reg32; bits 31:4 of reg32 are cleared to 0.]



Related Instructions 

MOVMSKPD, PMOVMSKB 

rFLAGS Affected 

None 

MXCSR Flags Affected 

None 
Exceptions

Exception                     Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD            X         X            X      SSE2 instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[SSE2] = 0.
                               X         X            X      The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.
                               X         X            X      The emulate bit (EM) of CR0 was set to 1.
Device not available, #NM      X         X            X      The task-switch bit (TS) of CR0 was set to 1.

MOVNTI Move Non-Temporal Doubleword or Quadword

Stores a value in a 32-bit or 64-bit general-purpose register (second operand) in a memory location 
(first operand). This instruction indicates to the processor that the data is non-temporal and is unlikely 
to be used again soon. The processor treats the store as a write-combining (WC) memory write, which 
minimizes cache pollution. The exact method by which cache pollution is minimized depends on the 
hardware implementation of the instruction. For further information, see “Memory Optimization” in 
Volume 1. 

The MOVNTI instruction is weakly-ordered with respect to other instructions that operate on memory. 
Software should use an SFENCE instruction to force strong memory ordering of MOVNTI with 
respect to other stores. 

The MOVNTI instruction is an SSE2 instruction. Support for SSE2 instructions is indicated by 
CPUID Fn0000_0001_EDX[SSE2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic               Opcode      Description
MOVNTI mem32, reg32    0F C3 /r    Stores a 32-bit general-purpose register value into a 32-bit memory location, minimizing cache pollution.
MOVNTI mem64, reg64    0F C3 /r    Stores a 64-bit general-purpose register value into a 64-bit memory location, minimizing cache pollution.

Related Instructions

MOVNTDQ, MOVNTPD, MOVNTPS, MOVNTQ


rFLAGS Affected 

None 


Exceptions

Exception (vector)            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD            X         X            X      SSE2 instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[SSE2] = 0.
Stack, #SS                     X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP        X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                      X      A null data segment was used to reference memory.
                                                      X      The destination operand was in a non-writable segment.
Page fault, #PF                          X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                     X            X      An unaligned memory reference was performed while alignment checking was enabled.

MOVS Move String

MOVSB 

MOVSW 

MOVSD 

MOVSQ 

Moves a byte, word, doubleword, or quadword from the memory location pointed to by DS:rSI to the 
memory location pointed to by ES:rDI, and then increments or decrements the rSI and rDI registers 
according to the state of the DF flag in the rFLAGS register. 

If the DF flag is 0, the instruction increments both pointers; otherwise, it decrements them. It 
increments or decrements the pointers by 1, 2, 4, or 8, depending on the size of the operands.
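
One iteration of the move-and-step behavior can be sketched as follows. This is a simplified model under stated assumptions: a single flat byte array stands in for both the DS and ES segments, and the movs_state type and movs_step function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical machine state for the model. */
typedef struct {
    uint8_t *mem;      /* flat memory image standing in for both segments */
    size_t rsi, rdi;   /* source and destination offsets into mem */
    int df;            /* direction flag: 0 = increment, 1 = decrement */
} movs_state;

/* One iteration of MOVSx for an operand of size bytes (1, 2, 4, or 8). */
void movs_step(movs_state *s, size_t size)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    memmove(s->mem + s->rdi, s->mem + s->rsi, size);   /* copy one element */
    if (s->df == 0) {
        s->rsi += size;    /* DF = 0: both pointers move up */
        s->rdi += size;
    } else {
        s->rsi -= size;    /* DF = 1: both pointers move down */
        s->rdi -= size;
    }
}
```

Calling movs_step repeatedly with the same size corresponds loosely to what a REP prefix does with the count held in rCX, which this sketch does not model.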

The forms of the MOVSx instruction with explicit operands address the first operand at seg:[rSI]. The
value of seg defaults to the DS segment, but can be overridden by a segment prefix. These instructions 
always address the second operand at ES:[rDI] (ES may not be overridden). The explicit operands 
serve only to specify the type (size) of the value being moved. 

The no-operands forms of the instruction use the DS:[rSI] and ES:[rDI] registers to point to the value 
to be moved (they do not allow a segment prefix). The mnemonic determines the size of the operands. 

Do not confuse this MOVSD instruction with the same-mnemonic MOVSD (move scalar
double-precision floating-point) instruction in the 128-bit media instruction set. Assemblers can
distinguish the instructions by the number and type of operands.

The MOVSx instructions support the REP prefixes. For details about the REP prefixes, see “Repeat 
Prefixes” on page 12. 


Mnemonic               Opcode    Description
MOVS mem8, mem8        A4        Move byte at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.
MOVS mem16, mem16      A5        Move word at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.
MOVS mem32, mem32      A5        Move doubleword at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.
MOVS mem64, mem64      A5        Move quadword at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.
MOVSB                  A4        Move byte at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.
MOVSW                  A5        Move word at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.
MOVSD                  A5        Move doubleword at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.
MOVSQ                  A5        Move quadword at DS:rSI to ES:rDI, and then increment or decrement rSI and rDI.

Related Instructions

MOV, LODSx, STOSx

rFLAGS Affected

None

Exceptions

Exception                     Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                     X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP        X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                      X      The destination operand was in a non-writable segment.
                                                      X      A null data segment was used to reference memory.
Page fault, #PF                          X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                     X            X      An unaligned memory reference was performed while alignment checking was enabled.

MOVSX Move with Sign-Extension

Copies the value in a register or memory location (second operand) into a register (first operand), 
extending the most significant bit of an 8-bit or 16-bit value into all higher bits in a 16-bit, 32-bit, or 
64-bit register. 
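As a rough illustration of the sign extension described above, the following sketch models the result in software. The function name `movsx` and its bit-width parameters are illustrative assumptions, not part of the architecture definition.

```python
def movsx(src, src_bits, dst_bits):
    """Sign-extend the low src_bits of src to a dst_bits-wide result.

    Illustrative model only: src_bits is 8 or 16, dst_bits is 16, 32, or 64.
    """
    src &= (1 << src_bits) - 1              # keep only the source operand bits
    if src >> (src_bits - 1):               # most significant source bit is 1
        src -= 1 << src_bits                # interpret the value as negative
    return src & ((1 << dst_bits) - 1)      # replicate the sign into the higher bits


extended = movsx(0x80, 8, 16)               # models MOVSX reg16, reg/mem8
```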


Mnemonic                Opcode    Description
MOVSX reg16, reg/mem8   0F BE /r  Move the contents of an 8-bit register or memory location to a 16-bit register with sign extension.
MOVSX reg32, reg/mem8   0F BE /r  Move the contents of an 8-bit register or memory location to a 32-bit register with sign extension.
MOVSX reg64, reg/mem8   0F BE /r  Move the contents of an 8-bit register or memory location to a 64-bit register with sign extension.
MOVSX reg32, reg/mem16  0F BF /r  Move the contents of a 16-bit register or memory location to a 32-bit register with sign extension.
MOVSX reg64, reg/mem16  0F BF /r  Move the contents of a 16-bit register or memory location to a 64-bit register with sign extension.

Related Instructions 




MOVSXD, MOVZX 

rFLAGS Affected 

None 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
MOVSXD Move with Sign-Extend Doubleword

Copies the 32-bit value in a register or memory location (second operand) into a 64-bit register (first 
operand), extending the most significant bit of the 32-bit value into all higher bits of the 64-bit register. 

This instruction requires the REX prefix 64-bit operand size bit (REX.W) to be set to 1 to sign-extend
a 32-bit source operand to a 64-bit result. Without the REX operand-size prefix, the operand size is
32 bits, the default for 64-bit mode, and the source is zero-extended into a 64-bit register. With a
16-bit operand size, only 16 bits are copied, without modifying the upper 48 bits in the destination.
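The three operand-size cases above can be summarized in a small software model. This is a sketch under stated assumptions: `movsxd` and its `operand_size` parameter are hypothetical names, and the function returns the new 64-bit destination value rather than modeling registers or encodings.

```python
MASK64 = (1 << 64) - 1


def movsxd(dest, src, operand_size):
    """Return the new 64-bit destination value for opcode 63 /r in 64-bit mode.

    operand_size is 64 (REX.W = 1), 32 (default), or 16. Illustrative model only.
    """
    if operand_size == 64:
        src &= 0xFFFFFFFF
        if src & 0x80000000:                 # bit 31 set: fill bits 63:32 with ones
            src |= 0xFFFFFFFF00000000
        return src
    if operand_size == 32:
        return src & 0xFFFFFFFF              # result is zero-extended to 64 bits
    if operand_size == 16:
        return (dest & ~0xFFFF & MASK64) | (src & 0xFFFF)   # upper 48 bits preserved
    raise ValueError("operand_size must be 16, 32, or 64")


signed = movsxd(0, 0x80000000, 64)
```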

This instruction is available only in 64-bit mode. In legacy or compatibility mode this opcode is 
interpreted as ARPL. 


Mnemonic                 Opcode  Description
MOVSXD reg64, reg/mem32  63 /r   Move the contents of a 32-bit register or memory operand to a 64-bit register with sign extension.

Related Instructions

MOVSX, MOVZX

rFLAGS Affected

None

Exceptions

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                                   X          A memory address was non-canonical.
General protection, #GP                      X          A memory address was non-canonical.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled.
MOVZX Move with Zero-Extension

Copies the value in a register or memory location (second operand) into a register (first operand),
zero-extending the value to fit in the destination register. The operand-size attribute determines the size of
the zero-extended value.
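For contrast with sign extension, zero extension can be modeled as a plain mask. The helper name `movzx` is an illustrative assumption; because every bit above the source width is 0, the destination width does not affect the numeric result.

```python
def movzx(src, src_bits):
    """Zero-extend: keep the low src_bits of src; all higher destination bits are 0."""
    return src & ((1 << src_bits) - 1)


widened = movzx(0x80, 8)    # 0x80 stays 0x80 in a 16-, 32-, or 64-bit destination
```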


Mnemonic                Opcode    Description
MOVZX reg16, reg/mem8   0F B6 /r  Move the contents of an 8-bit register or memory operand to a 16-bit register with zero-extension.
MOVZX reg32, reg/mem8   0F B6 /r  Move the contents of an 8-bit register or memory operand to a 32-bit register with zero-extension.
MOVZX reg64, reg/mem8   0F B6 /r  Move the contents of an 8-bit register or memory operand to a 64-bit register with zero-extension.
MOVZX reg32, reg/mem16  0F B7 /r  Move the contents of a 16-bit register or memory operand to a 32-bit register with zero-extension.
MOVZX reg64, reg/mem16  0F B7 /r  Move the contents of a 16-bit register or memory operand to a 64-bit register with zero-extension.

Related Instructions 




MOVSXD, MOVSX 

rFLAGS Affected 

None 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
MUL Unsigned Multiply

Multiplies the unsigned byte, word, doubleword, or quadword value in the specified register or
memory location by the value in AL, AX, EAX, or RAX and stores the result in AX, DX:AX,
EDX:EAX, or RDX:RAX (depending on the operand size). It puts the high-order bits of the product in
AH, DX, EDX, or RDX.

If the upper half of the product is non-zero, the instruction sets the carry flag (CF) and overflow flag 
(OF) both to 1. Otherwise, it clears CF and OF to 0. The other arithmetic flags (SF, ZF, AF, PF) are 
undefined. 
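The split of the product into high and low halves, and the CF/OF rule above, can be sketched in software as follows. The function `mul` and its return shape are illustrative assumptions; the undefined flags (SF, ZF, AF, PF) are not modeled.

```python
def mul(operand, accumulator, bits):
    """Model unsigned MUL for an operand size of 8, 16, 32, or 64 bits.

    Returns (high, low, carry): the upper and lower halves of the double-width
    product, and the common value of CF and OF. Illustrative model only.
    """
    mask = (1 << bits) - 1
    product = (operand & mask) * (accumulator & mask)
    high, low = product >> bits, product & mask
    carry = 1 if high != 0 else 0           # CF = OF = 1 when the upper half is non-zero
    return high, low, carry


high, low, carry = mul(0x10, 0x10, 8)       # models MUL reg/mem8 with AL = 10h
```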


Mnemonic       Opcode  Description
MUL reg/mem8   F6 /4   Multiplies an 8-bit register or memory operand by the contents of the AL register and stores the result in the AX register.
MUL reg/mem16  F7 /4   Multiplies a 16-bit register or memory operand by the contents of the AX register and stores the result in the DX:AX register.
MUL reg/mem32  F7 /4   Multiplies a 32-bit register or memory operand by the contents of the EAX register and stores the result in the EDX:EAX register.
MUL reg/mem64  F7 /4   Multiplies a 64-bit register or memory operand by the contents of the RAX register and stores the result in the RDX:RAX register.


Related Instructions 

DIV 

rFLAGS Affected 


ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
                                     M               U   U   U   U   M
21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.
Exceptions

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
MULX Multiply Unsigned

Computes the unsigned product of the specified source operand and the implicit source operand rDX. 
Writes the upper half of the product to the first destination and the lower half to the second. Does not 
affect the arithmetic flags. 

This instruction has three operands: 

MULX dest1, dest2, src

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand
size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored. 16-bit
operands are not supported.

The first and second operands (dest1 and dest2) are general purpose registers. The specified source
operand (src) is either a general purpose register or a memory operand. If the first and second operands
specify the same register, the register receives the upper half of the product.
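The following sketch models the result of MULX in software. The function `mulx` and its argument names are illustrative assumptions; no flags are produced because the instruction does not affect them.

```python
def mulx(src, rdx, bits):
    """Model MULX for a 32- or 64-bit operand size.

    Returns (dest1, dest2): the upper and lower halves of src * rDX.
    Illustrative model only; register allocation and encodings are not modeled.
    """
    mask = (1 << bits) - 1
    product = (src & mask) * (rdx & mask)
    return product >> bits, product & mask


dest1, dest2 = mulx(0xFFFFFFFF, 2, 32)
```

If dest1 and dest2 name the same register, only the first element of the returned pair (the upper half) would survive, matching the rule stated above.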

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                      Encoding
                              VEX  RXB.map_select  W.vvvv.L.pp   Opcode
MULX reg32, reg32, reg/mem32  C4   RXB.02          0.dest2.0.11  F6 /r
MULX reg64, reg64, reg/mem64  C4   RXB.02          1.dest2.0.11  F6 /r


Related Instructions 


rFLAGS Affected 

None. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        BMI2 instructions are only recognized in protected mode.
                                             X          BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
                                             X          VEX.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled.
MWAITX Monitor Wait with Timeout

Used in conjunction with the MONITORX instruction to cause a processor to wait until a store occurs 
to a specific linear address range from another processor or the timer expires. The previously executed 
MONITORX instruction causes the processor to enter the monitor event pending state. The MWAITX 
instruction may enter an implementation dependent power state until the monitor event pending state 
is exited. The MWAITX instruction has the same effect on architectural state as the NOP instruction. 

Events that cause an exit from the monitor event pending state include: 

• A store from another processor matches the address range established by the MONITORX 
instruction. 

• The timer expires. 

• Any unmasked interrupt, including INTR, NMI, SMI, INIT. 

• RESET. 

• Any far control transfer that occurs between the MONITORX and the MWAITX. 

EAX specifies optional hints for the MWAITX instruction. The optimized C-state request is
communicated through EAX[7:4]. The processor C-state is EAX[7:4]+1, so to request C0, place
the value F in EAX[7:4], and to request C1, place the value 0 in EAX[7:4]. All other components of
EAX should be zero when making the C1 request. The processor ignores any reserved bit that is set
in EAX.
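The C-state hint encoding above amounts to a 4-bit subtraction that wraps around. A minimal sketch follows; the helper name `mwaitx_hint_for_cstate` is a hypothetical choice, and which C-states a given processor actually honors is implementation dependent.

```python
def mwaitx_hint_for_cstate(cstate):
    """Return an EAX value whose bits 7:4 request the given C-state (0 to 15).

    The field holds (cstate - 1) modulo 16, so C0 encodes as Fh and C1 as 0h.
    All other EAX bits are left at zero. Hypothetical helper for illustration.
    """
    if not 0 <= cstate <= 15:
        raise ValueError("C-state must fit the 4-bit hint field")
    return ((cstate - 1) & 0xF) << 4


eax_for_c0 = mwaitx_hint_for_cstate(0)
eax_for_c1 = mwaitx_hint_for_cstate(1)
```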

ECX specifies optional extensions for the MWAITX instruction. The extensions currently defined for 
ECX are: 

• Bit 0: When set, allows interrupts to wake MWAITX, even when eFLAGS.IF = 0. Support for this 
extension is indicated by a feature flag returned by the CPUID instruction. 

• Bit 1: When set, EBX contains the maximum wait time expressed in Software P0 clocks, the same
clocks counted by the TSC. Setting bit 1 but passing a value of zero in EBX is equivalent to
clearing bit 1: the timer is not an exit condition.

• Bits 31:2: When non-zero, results in a #GP(0) exception.
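The ECX extension bits listed above can be decoded as in the following sketch. The names `decode_mwaitx_ecx` and `GeneralProtectionFault` are hypothetical, and the model assumes the interrupt extension (bit 0) is supported by the processor.

```python
class GeneralProtectionFault(Exception):
    """Stand-in for a #GP(0) exception in this software model."""


def decode_mwaitx_ecx(ecx, ebx):
    """Decode the MWAITX extensions in ECX.

    Returns (wake_on_masked_interrupt, timeout), where timeout is the maximum
    wait in clocks or None when the timer is not an exit condition.
    """
    if ecx >> 2:                                  # any of bits 31:2 set
        raise GeneralProtectionFault("reserved ECX bits set")
    wake_on_masked_interrupt = bool(ecx & 0b01)   # bit 0
    timer_enabled = bool(ecx & 0b10) and ebx != 0 # bit 1 with EBX = 0 disables the timer
    return wake_on_masked_interrupt, (ebx if timer_enabled else None)


decoded = decode_mwaitx_ecx(0b11, 1000)
```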

CPUID Function 0000_0005h indicates support for extended features of MONITORX/MWAITX as 
well as MONITOR/MWAIT: 

• CPUID Fn0000_0005_ECX[EMX] = 1 indicates support for enumeration of
MONITOR/MWAIT/MONITORX/MWAITX extensions.

• CPUID Fn0000_0005_ECX[IBE] = 1 indicates that MWAIT/MWAITX can set ECX[0] to allow 
interrupts to cause an exit from the monitor event pending state even when eFLAGS.IF = 0. 

The MWAITX instruction can be executed at any privilege level and MSR 
C001_0015h[MonMwaitUserEn] has no effect on MWAITX. 

Support for the MWAITX instruction is indicated by CPUID Fn0000_0001_ECX[MONITORX] = 1. 
Software must check the CPUID bit once per program or library initialization before using the
MWAITX instruction, or inconsistent behavior may result. 

The use of the MWAITX instruction is contingent upon the satisfaction of the following coding 
requirements: 

• MONITORX must precede the MWAITX and occur in the same loop. 

• MWAITX must be conditionally executed only if the awaited store has not already occurred. (This 
prevents a race condition between the MONITORX instruction arming the monitoring hardware 
and the store intended to trigger the monitoring hardware.) 

There is no indication after exiting MWAITX of why the processor exited or whether the timer expired. It is
up to software to check whether the awaited store has occurred and, if not, to determine how much
time has elapsed if it wants to re-establish the MONITORX with a new timer value.
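The coding requirements above suggest a loop shape like the following sketch. It is not real driver code: `store_seen`, `monitorx`, `mwaitx`, and `read_tsc` are hypothetical callables standing in for a memory check, the two instructions, and a TSC read, so that the control flow can be exercised in plain software.

```python
def wait_for_store(store_seen, monitorx, mwaitx, read_tsc, timeout_clocks):
    """Wait until store_seen() is true or timeout_clocks have elapsed.

    Returns True if the store was observed and False on timeout.
    All four callables are hypothetical stand-ins supplied by the caller.
    """
    deadline = read_tsc() + timeout_clocks
    while not store_seen():
        remaining = deadline - read_tsc()   # recompute the timer after every exit
        if remaining <= 0:
            return False                    # timed out without seeing the store
        monitorx()                          # arm the monitor inside the same loop
        if not store_seen():                # re-check to close the arming race
            mwaitx(remaining)               # wait only while the store is pending
    return True
```

Because MWAITX gives no exit reason, the loop re-tests the condition itself after every wakeup instead of trusting the wait.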


Mnemonic  Opcode    Description
MWAITX    0F 01 FB  Causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events.


Related Instructions 

MONITORX, MONITOR, MWAIT 

rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          MONITORX/MWAITX instructions are not supported, as indicated by CPUID Fn0000_0001_ECX[MONITORX] = 0.
General protection, #GP  X     X             X          Unsupported extension bits were set in ECX.
NEG Two’s Complement Negation

Performs the two’s complement negation of the value in the specified register or memory location by 
subtracting the value from 0. Use this instruction only on signed integer numbers. 

If the value is 0, the instruction clears the CF flag to 0; otherwise, it sets CF to 1. The OF, SF, ZF, AF, 
and PF flag settings depend on the result of the operation. 
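A software model of the negation and some of its flag results is sketched below. The function `neg` and the dictionary layout are illustrative assumptions; AF and PF are omitted, and the OF rule used here (set only when the operand is the most negative value, which has no positive counterpart) is the usual two's complement overflow condition rather than a statement taken from the text above.

```python
def neg(value, bits):
    """Model NEG for an operand size of 8, 16, 32, or 64 bits.

    Returns (result, flags) with CF, ZF, SF, and OF. Illustrative model only.
    """
    mask = (1 << bits) - 1
    value &= mask
    result = (0 - value) & mask                 # subtract the operand from 0
    flags = {
        "CF": 0 if value == 0 else 1,           # cleared only for a zero operand
        "ZF": 1 if result == 0 else 0,
        "SF": result >> (bits - 1),
        "OF": 1 if value == 1 << (bits - 1) else 0,
    }
    return result, flags


result, flags = neg(1, 8)
```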

The forms of the NEG instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 


Mnemonic       Opcode  Description
NEG reg/mem8   F6 /3   Performs a two’s complement negation on an 8-bit register or memory operand.
NEG reg/mem16  F7 /3   Performs a two’s complement negation on a 16-bit register or memory operand.
NEG reg/mem32  F7 /3   Performs a two’s complement negation on a 32-bit register or memory operand.
NEG reg/mem64  F7 /3   Performs a two’s complement negation on a 64-bit register or memory operand.


Related Instructions 

AND, NOT, OR, XOR 

rFLAGS Affected 


ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
                                     M               M   M   M   M   M
21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.
Exceptions

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
NOP No Operation

Does nothing. This instruction increments the rIP to point to the next instruction, but does not affect the
machine state in any other way.

The single-byte variant is an alias for XCHG rAX, rAX. 


Mnemonic       Opcode    Description
NOP            90        Performs no operation.
NOP reg/mem16  0F 1F /0  Performs no operation on a 16-bit register or memory operand.
NOP reg/mem32  0F 1F /0  Performs no operation on a 32-bit register or memory operand.
NOP reg/mem64  0F 1F /0  Performs no operation on a 64-bit register or memory operand.

Related Instructions

None

rFLAGS Affected

None


Exceptions 

None 
NOT One’s Complement Negation

Performs the one’s complement negation of the value in the specified register or memory location by 
inverting each bit of the value. 

The memory-operand forms of the NOT instruction support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 


Mnemonic       Opcode  Description
NOT reg/mem8   F6 /2   Complements the bits in an 8-bit register or memory operand.
NOT reg/mem16  F7 /2   Complements the bits in a 16-bit register or memory operand.
NOT reg/mem32  F7 /2   Complements the bits in a 32-bit register or memory operand.
NOT reg/mem64  F7 /2   Complements the bits in a 64-bit register or memory operand.


Related Instructions 

AND, NEG, OR, XOR 

rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
OR Logical OR

Performs a logical OR on the bits in a register, memory location, or immediate value (second operand)
and a register or memory location (first operand) and stores the result in the first operand location. The
two operands cannot both be memory locations.

If both corresponding bits are 0, the corresponding bit of the result is 0; otherwise, the corresponding 
result bit is 1. 
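The bitwise rule above, together with the usual flag results of a logical operation, can be sketched as follows. The function `or_op` is an illustrative assumption: it clears OF and CF, derives ZF and SF from the result, computes PF as even parity of the low byte, and leaves AF out because it is undefined.

```python
def or_op(dest, src, bits):
    """Model OR for an operand size of 8, 16, 32, or 64 bits.

    Returns (result, flags). Illustrative model only; AF is not modeled.
    """
    mask = (1 << bits) - 1
    result = (dest | src) & mask
    flags = {
        "OF": 0,
        "CF": 0,
        "ZF": 1 if result == 0 else 0,
        "SF": result >> (bits - 1),
        "PF": 1 if bin(result & 0xFF).count("1") % 2 == 0 else 0,
    }
    return result, flags


result, flags = or_op(0x0F, 0xF0, 8)
```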

The forms of the OR instruction that write to memory support the LOCK prefix. For details about the
LOCK prefix, see “Lock Prefix” on page 11.


Mnemonic              Opcode    Description
OR AL, imm8           0C ib     OR the contents of AL with an immediate 8-bit value.
OR AX, imm16          0D iw     OR the contents of AX with an immediate 16-bit value.
OR EAX, imm32         0D id     OR the contents of EAX with an immediate 32-bit value.
OR RAX, imm32         0D id     OR the contents of RAX with a sign-extended immediate 32-bit value.
OR reg/mem8, imm8     80 /1 ib  OR the contents of an 8-bit register or memory operand and an immediate 8-bit value.
OR reg/mem16, imm16   81 /1 iw  OR the contents of a 16-bit register or memory operand and an immediate 16-bit value.
OR reg/mem32, imm32   81 /1 id  OR the contents of a 32-bit register or memory operand and an immediate 32-bit value.
OR reg/mem64, imm32   81 /1 id  OR the contents of a 64-bit register or memory operand and a sign-extended immediate 32-bit value.
OR reg/mem16, imm8    83 /1 ib  OR the contents of a 16-bit register or memory operand and a sign-extended immediate 8-bit value.
OR reg/mem32, imm8    83 /1 ib  OR the contents of a 32-bit register or memory operand and a sign-extended immediate 8-bit value.
OR reg/mem64, imm8    83 /1 ib  OR the contents of a 64-bit register or memory operand and a sign-extended immediate 8-bit value.
OR reg/mem8, reg8     08 /r     OR the contents of an 8-bit register or memory operand with the contents of an 8-bit register.
OR reg/mem16, reg16   09 /r     OR the contents of a 16-bit register or memory operand with the contents of a 16-bit register.
OR reg/mem32, reg32   09 /r     OR the contents of a 32-bit register or memory operand with the contents of a 32-bit register.
OR reg/mem64, reg64   09 /r     OR the contents of a 64-bit register or memory operand with the contents of a 64-bit register.
OR reg8, reg/mem8     0A /r     OR the contents of an 8-bit register with the contents of an 8-bit register or memory operand.
OR reg16, reg/mem16   0B /r     OR the contents of a 16-bit register with the contents of a 16-bit register or memory operand.
OR reg32, reg/mem32   0B /r     OR the contents of a 32-bit register with the contents of a 32-bit register or memory operand.
OR reg64, reg/mem64   0B /r     OR the contents of a 64-bit register with the contents of a 64-bit register or memory operand.


The following chart summarizes the effect of this instruction: 


X  Y  X or Y
0  0  0
0  1  1
1  0  1
1  1  1


Related Instructions 

AND, NEG, NOT, XOR 

rFLAGS Affected 


ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
                                     0               M   M   U   M   0
21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
OUT Output to Port

Copies the value from the AL, AX, or EAX register (second operand) to an I/O port (first operand).
The port address can be a byte-immediate value (00h to FFh) or the value in the DX register (0000h to
FFFFh). The source register used determines the size of the port (8, 16, or 32 bits).

If the operand size is 64 bits, OUT only writes to a 32-bit I/O port.

If the CPL is higher than the IOPL or the mode is virtual mode, OUT checks the I/O permission bitmap
in the TSS before allowing access to the I/O port. See Volume 2 for details on the TSS I/O permission
bitmap.
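The permission check can be pictured with the following sketch. It rests on assumptions that should be confirmed against Volume 2: the bitmap is modeled as a byte string with one bit per byte-wide port, an access is allowed only if every bit covering the accessed bytes is 0, and ports beyond the end of the bitmap are treated as denied. The helper name `io_access_allowed` is hypothetical.

```python
def io_access_allowed(bitmap, port, size):
    """Return True if an I/O access of size bytes at port passes the bitmap check.

    bitmap is a bytes-like object; bit n describes port n (1 = access denied).
    Illustrative model only; the TSS layout and limit checks are simplified.
    """
    for p in range(port, port + size):
        byte_index, bit = divmod(p, 8)
        if byte_index >= len(bitmap):       # assumed: outside the bitmap means denied
            return False
        if (bitmap[byte_index] >> bit) & 1: # a set bit denies the access
            return False
    return True


bitmap = bytearray(8192)                    # covers ports 0000h to FFFFh, all allowed
bitmap[0x3F8 // 8] |= 1 << (0x3F8 % 8)      # deny port 3F8h
```

In this model a denied access corresponds to the #GP case, and an allowed access lets the OUT proceed.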


Mnemonic       Opcode  Description
OUT imm8, AL   E6 ib   Output the byte in the AL register to the port specified by an 8-bit immediate value.
OUT imm8, AX   E7 ib   Output the word in the AX register to the port specified by an 8-bit immediate value.
OUT imm8, EAX  E7 ib   Output the doubleword in the EAX register to the port specified by an 8-bit immediate value.
OUT DX, AL     EE      Output byte in AL to the output port specified in DX.
OUT DX, AX     EF      Output word in AX to the output port specified in DX.
OUT DX, EAX    EF      Output doubleword in EAX to the output port specified in DX.


Related Instructions 

IN, INSx, OUTSx 

rFLAGS Affected 

None 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP        X                        One or more I/O permission bits were set in the TSS for the accessed port.
                                             X          The CPL was greater than the IOPL and one or more I/O permission bits were set in the TSS for the accessed port.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
OUTS Output String

OUTSB 

OUTSW 

OUTSD 

Copies data from the memory location pointed to by DS:rSI to the I/O port address (0000h to FFFFh)
specified in the DX register, and then increments or decrements the rSI register according to the setting
of the DF flag in the rFLAGS register.

If the DF flag is 0, the instruction increments rSI; otherwise, it decrements rSI. It increments or
decrements the pointer by 1, 2, or 4, depending on the size of the value being copied.

The OUTS DX mnemonic uses an explicit memory operand (second operand) to determine the type 
(size) of the value being copied, but always uses DS:rSI for the location of the value to copy. The 
explicit register operand (first operand) specifies the I/O port address and must always be DX. 

The no-operands forms of the mnemonic use the DS:rSI register pair to point to the memory data to be 
copied and the contents of the DX register as the destination I/O port address. The mnemonic specifies 
the size of the I/O port and the type (size) of the value being copied. 

The OUTSx instruction supports the REP prefix. For details about the REP prefix, see “Repeat 
Prefixes” on page 12. 

If the effective operand size is 64 bits, the instruction behaves as if the operand size were 32 bits.

If the CPL is higher than the IOPL or the mode is virtual mode, OUTSx checks the I/O permission 
bitmap in the TSS before allowing access to the I/O port. See Volume 2 for details on the TSS I/O 
permission bitmap. 


Mnemonic         Opcode  Description
OUTS DX, mem8    6E      Output the byte in DS:rSI to the port specified in DX, then increment or decrement rSI.
OUTS DX, mem16   6F      Output the word in DS:rSI to the port specified in DX, then increment or decrement rSI.
OUTS DX, mem32   6F      Output the doubleword in DS:rSI to the port specified in DX, then increment or decrement rSI.
OUTSB            6E      Output the byte in DS:rSI to the port specified in DX, then increment or decrement rSI.
OUTSW            6F      Output the word in DS:rSI to the port specified in DX, then increment or decrement rSI.
OUTSD            6F      Output the doubleword in DS:rSI to the port specified in DX, then increment or decrement rSI.
Related Instructions

IN, INSx, OUT 

rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
                               X                        One or more I/O permission bits were set in the TSS for the accessed port.
                                             X          The CPL was greater than the IOPL and one or more I/O permission bits were set in the TSS for the accessed port.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
PAUSE Pause

Improves the performance of spin loops by providing a hint to the processor that the current code is in
a spin loop. The processor may use this hint to optimize power consumption while in the spin loop.

Architecturally, this instruction behaves like a NOP instruction. 

Processors that do not support PAUSE treat this opcode as a NOP instruction. 


Mnemonic  Opcode  Description
PAUSE     F3 90   Provides a hint to the processor that a spin loop is being executed.

Related Instructions

None

rFLAGS Affected

None

Exceptions

None
PDEP Parallel Deposit Bits

Scatters consecutive bits of the first source operand, starting at the least significant bit, to bit positions
in the destination as specified by 1 bits in the second source operand (mask). Bit positions in the
destination corresponding to 0 bits in the mask are cleared.

This instruction has three operands: 

PDEP dest, src, mask 

The following diagram illustrates the operation of this instruction. 
If the mask is all ones, the execution of this instruction effectively copies the source to the destination.
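The scatter operation can be modeled bit by bit in software, as in the sketch below. The function `pdep` and its `bits` parameter are illustrative assumptions; the loop favors clarity over speed and says nothing about how hardware implements the instruction.

```python
def pdep(src, mask, bits=64):
    """Model PDEP: deposit the low-order bits of src at the 1 positions of mask.

    bits is the operand size (32 or 64). Illustrative model only.
    """
    result = 0
    next_src_bit = 0                         # index of the next source bit to deposit
    for position in range(bits):
        if (mask >> position) & 1:
            if (src >> next_src_bit) & 1:
                result |= 1 << position
            next_src_bit += 1                # consumed even when the source bit is 0
    return result


deposited = pdep(0b101, 0b11010, 32)         # mask selects bit positions 1, 3, and 4
```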

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand
size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored. 16-bit
operands are not supported.

The destination (dest) and the source (src) are general-purpose registers. The second source operand
(mask) is either a general-purpose register or a memory operand.

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                      Encoding
                              VEX  RXB.map_select  W.vvvv.L.pp  Opcode
PDEP reg32, reg32, reg/mem32  C4   RXB.02          0.src.0.11   F5 /r
PDEP reg64, reg64, reg/mem64  C4   RXB.02          1.src.0.11   F5 /r
Related Instructions


rFLAGS Affected 

None. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        BMI2 instructions are only recognized in protected mode.
                                             X          BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
                                             X          VEX.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled.
PEXT Parallel Extract Bits

Copies bits from the source operand, based on a mask, and packs them into the low-order bits of the 
destination. Clears all bits in the destination to the left of the most-significant bit copied. 

This instruction has three operands: 

PEXT dest, src, mask 

The following diagram illustrates the operation of this instruction. 
If the mask is all ones, the execution of this instruction effectively copies the source to the destination.
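The gather operation can be modeled bit by bit in software, as in the sketch below. The function `pext` and its `bits` parameter are illustrative assumptions, and the loop is written for clarity rather than speed.

```python
def pext(src, mask, bits=64):
    """Model PEXT: pack the src bits selected by mask into the low-order result bits.

    bits is the operand size (32 or 64). Illustrative model only.
    """
    result = 0
    next_dest_bit = 0                        # index of the next result bit to fill
    for position in range(bits):
        if (mask >> position) & 1:
            if (src >> position) & 1:
                result |= 1 << next_dest_bit
            next_dest_bit += 1               # advances for every selected position
    return result


extracted = pext(0b10110, 0b11010, 32)       # mask selects bit positions 1, 3, and 4
```

Bits of the result above the last packed position stay 0, which matches the clearing behavior described for the destination.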

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand
size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored. 16-bit
operands are not supported.

The destination (dest) and the source (src) are general-purpose registers. The second source operand 
(mask) is either a general-purpose register or a memory operand. 
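
The bit-gathering behaviour described above can be sketched as a small software model. This is only an illustration of the semantics, not how the hardware implements the instruction; the function name `pext` and its `width` parameter are hypothetical.

```python
def pext(src: int, mask: int, width: int = 64) -> int:
    """Illustrative software model of PEXT for a 32- or 64-bit operand size."""
    if width not in (32, 64):
        raise ValueError("PEXT supports only 32- and 64-bit operands")
    limit = (1 << width) - 1
    src &= limit
    mask &= limit
    dest = 0       # bits above the last packed bit stay cleared
    out_pos = 0    # next low-order destination bit to fill
    for bit in range(width):
        if (mask >> bit) & 1:
            # Copy the selected source bit into the next low-order position.
            dest |= ((src >> bit) & 1) << out_pos
            out_pos += 1
    return dest
```

With a mask of all ones the loop copies every bit to the same position, which matches the statement that the source is effectively copied to the destination.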

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                        Encoding

                                VEX   RXB.map_select   W.vvvv.L.pp   Opcode
PEXT reg32, reg32, reg/mem32    C4    RXB.02           0.src.0.10    F5 /r
PEXT reg64, reg64, reg/mem64    C4    RXB.02           1.src.0.10    F5 /r

Related Instructions


rFLAGS Affected

None.


Exceptions

Exception                 Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        BMI2 instructions are only recognized in protected mode.
                                              X          BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
                                              X          VEX.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.

POP Pop Stack 

Copies the value pointed to by the stack pointer (SS:rSP) to the specified register or memory location 
and then increments the rSP by 2 for a 16-bit pop, 4 for a 32-bit pop, or 8 for a 64-bit pop. 

The operand-size attribute determines the amount by which the stack pointer is incremented (2, 4 or 8 
bytes). The stack-size attribute determines whether SP, ESP, or RSP is incremented. 
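
As a rough illustration of the read-then-increment behaviour, the following sketch models a flat little-endian stack. The names `pop`, `memory`, and `rsp` are hypothetical, and segment checks and stack-size wraparound are deliberately left out.

```python
def pop(memory: bytearray, rsp: int, operand_size: int):
    """Illustrative model: return (value, new_rsp) for a 16-, 32-, or 64-bit pop."""
    if operand_size not in (16, 32, 64):
        raise ValueError("operand size must be 16, 32, or 64 bits")
    nbytes = operand_size // 8
    # The value is read from the current top of stack first...
    value = int.from_bytes(memory[rsp:rsp + nbytes], "little")
    # ...and only then is the stack pointer incremented by 2, 4, or 8.
    return value, rsp + nbytes
```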

For forms of the instruction that load a segment register (POP DS, POP ES, POP FS, POP GS, POP 
SS), the source operand must be a valid segment selector. When a segment selector is popped into a 
segment register, the processor also loads all associated descriptor information into the hidden part of 
the register and validates it. 

It is possible to pop a null segment selector value (0000-0003h) into the DS, ES, FS, or GS register. 
This action does not cause a general protection fault, but a subsequent reference to such a segment 
does cause a #GP exception. For more information about segment selectors, see "Segment Selectors 
and Registers" in Volume 2: System Programming. 

In 64-bit mode, the POP operand size defaults to 64 bits and there is no prefix available to encode a 
32-bit operand size. Using the POP DS, POP ES, or POP SS instruction in 64-bit mode generates an 
invalid-opcode exception. 

This instruction cannot pop a value into the CS register. The RET (Far) instruction performs this 
function. 


Mnemonic          Opcode    Description
POP reg/mem16     8F /0     Pop the top of the stack into a 16-bit register or memory location.
POP reg/mem32     8F /0     Pop the top of the stack into a 32-bit register or memory location. (No prefix for encoding this in 64-bit mode.)
POP reg/mem64     8F /0     Pop the top of the stack into a 64-bit register or memory location.
POP reg16         58 +rw    Pop the top of the stack into a 16-bit register.
POP reg32         58 +rd    Pop the top of the stack into a 32-bit register. (No prefix for encoding this in 64-bit mode.)
POP reg64         58 +rq    Pop the top of the stack into a 64-bit register.
POP DS            1F        Pop the top of the stack into the DS register. (Invalid in 64-bit mode.)
POP ES            07        Pop the top of the stack into the ES register. (Invalid in 64-bit mode.)
POP SS            17        Pop the top of the stack into the SS register. (Invalid in 64-bit mode.)
POP FS            0F A1     Pop the top of the stack into the FS register.
POP GS            0F A9     Pop the top of the stack into the GS register.

Related Instructions

PUSH

rFLAGS Affected

None

Exceptions


Exception                             Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                                       X          POP DS, POP ES, or POP SS was executed in 64-bit mode.
Segment not present, #NP (selector)                       X          The DS, ES, FS, or GS register was loaded with a non-null segment selector and the segment was marked not present.
Stack, #SS                            X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
Stack, #SS (selector)                                     X          The SS register was loaded with a non-null segment selector and the segment was marked not present.
General protection, #GP               X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                          X          The destination operand was in a non-writable segment.
                                                          X          A null data segment was used to reference memory.
                                                          X          A segment register was loaded and the segment descriptor exceeded the descriptor table limit.
                                                          X          A segment register was loaded and the segment selector's TI bit was set, but the LDT selector was a null selector.
                                                          X          The SS register was loaded with a null segment selector in non-64-bit mode or while CPL = 3.
General protection, #GP (selector)                        X          The SS register was loaded and the segment selector RPL and the segment descriptor DPL were not equal to the CPL.
                                                          X          The SS register was loaded and the segment pointed to was not a writable data segment.
                                                          X          The DS, ES, FS, or GS register was loaded and the segment pointed to was a data or non-conforming code segment, but the RPL or the CPL was greater than the DPL.
                                                          X          The DS, ES, FS, or GS register was loaded and the segment pointed to was not a data segment or readable code segment.
Page fault, #PF                             X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                        X             X          An unaligned memory reference was performed while alignment checking was enabled.

POPA POP All GPRs 

POPAD 

Pops words or doublewords from the stack into the general-purpose registers in the following order: 
eDI, eSI, eBP, eSP (image is popped and discarded), eBX, eDX, eCX, and eAX. The instruction 
increments the stack pointer by 16 or 32, depending on the operand size. 
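
The pop order and the discarded stack-pointer image can be illustrated with a short model. This is a sketch under simplifying assumptions (flat little-endian memory, no segment checks); the names `popa`, `POPA_ORDER`, and the register dictionary are hypothetical.

```python
# Pop order from the description above; the SP image is read but discarded.
POPA_ORDER = ("DI", "SI", "BP", "SP", "BX", "DX", "CX", "AX")

def popa(memory: bytearray, sp: int, regs: dict, operand_size: int = 16) -> dict:
    """Illustrative model of POPA (16-bit) or POPAD (32-bit)."""
    if operand_size not in (16, 32):
        raise ValueError("operand size must be 16 or 32 bits")
    nbytes = operand_size // 8
    new_regs = dict(regs)
    for name in POPA_ORDER:
        value = int.from_bytes(memory[sp:sp + nbytes], "little")
        if name != "SP":          # the stack-pointer image is skipped
            new_regs[name] = value
        sp += nbytes
    new_regs["SP"] = sp           # net increment: 16 or 32 bytes
    return new_regs
```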

Using the POPA or POPAD instructions in 64-bit mode generates an invalid-opcode exception. 


Mnemonic   Opcode   Description
POPA       61       Pop the DI, SI, BP, SP, BX, DX, CX, and AX registers. (Invalid in 64-bit mode.)
POPAD      61       Pop the EDI, ESI, EBP, ESP, EBX, EDX, ECX, and EAX registers. (Invalid in 64-bit mode.)




Related Instructions 

PUSHA, PUSHAD 

rFLAGS Affected 

None 


Exceptions 


Exception                 Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                           X          This instruction was executed in 64-bit mode.
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.

POPCNT Bit Population Count 

Counts the number of bits having a value of 1 in the source operand and places the result in the 
destination register. The source operand is a 16-, 32-, or 64-bit general purpose register or memory 
operand; the destination operand is a general purpose register of the same size as the source operand 
register. 

If the input operand is zero, the ZF flag is set to 1 and zero is written to the destination register. 
Otherwise, the ZF flag is cleared. The other flags are cleared. 
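
A minimal software model of the count and the resulting flags, assuming the flag behaviour stated above (ZF reflects a zero source, the other arithmetic flags are cleared). The function name `popcnt` and the flag dictionary are hypothetical conveniences.

```python
def popcnt(src: int, width: int = 64):
    """Illustrative model: return (count, flags) for a 16-, 32-, or 64-bit source."""
    if width not in (16, 32, 64):
        raise ValueError("operand size must be 16, 32, or 64 bits")
    src &= (1 << width) - 1
    count = bin(src).count("1")   # number of bits set to 1
    flags = {"OF": 0, "SF": 0, "ZF": int(src == 0), "AF": 0, "PF": 0, "CF": 0}
    return count, flags
```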

Support for the POPCNT instruction is indicated by CPUID Fn0000_0001_ECX[POPCNT] = 1. 
Software MUST check the CPUID bit once per program or library initialization before using the 
POPCNT instruction, or inconsistent behavior may result. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 
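
Once CPUID has been executed, the feature test reduces to a single bit check on the returned register value. The sketch below assumes the raw register value has already been obtained by some other means. The bit positions (ECX bit 23 for POPCNT, EBX bit 8 for BMI2) are not stated in this section and should be confirmed against the CPUID documentation; the names are hypothetical.

```python
POPCNT_BIT = 23   # assumed position of Fn0000_0001_ECX[POPCNT]
BMI2_BIT = 8      # assumed position of Fn0000_0007_EBX_x0[BMI2]

def has_feature(register_value: int, bit: int) -> bool:
    """Return True if the given feature bit is set in a CPUID output register."""
    return bool((register_value >> bit) & 1)
```

A program would typically perform this check once during initialization and cache the result.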


Mnemonic                    Opcode        Description
POPCNT reg16, reg/mem16     F3 0F B8 /r   Count the 1s in reg/mem16
POPCNT reg32, reg/mem32     F3 0F B8 /r   Count the 1s in reg/mem32
POPCNT reg64, reg/mem64     F3 0F B8 /r   Count the 1s in reg/mem64



Related Instructions 

BSF, BSR, LZCNT 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       0     M     0     0     0
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 
Exceptions

Exception                 Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          The POPCNT instruction is not supported, as indicated by CPUID Fn0000_0001_ECX[POPCNT] = 0.
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.

POPF POP to rFLAGS 

POPFD 

POPFQ 

Pops a word, doubleword, or quadword from the stack into the rFLAGS register and then increments 
the stack pointer by 2, 4, or 8, depending on the operand size. 

In protected or real mode, all the non-reserved flags in the rFLAGS register can be modified, except 
the VIP, VIF, and VM flags, which are unchanged. In protected mode, at a privilege level greater than 
0 the IOPL is also unchanged. The instruction alters the interrupt flag (IF) only when the CPL is less 
than or equal to the IOPL. 
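
The protected-mode rules above amount to building a write mask before merging the popped image into rFLAGS. The following sketch assumes a 32- or 64-bit operand size and standard rFLAGS bit positions; `popf_protected` and the constant names are hypothetical, and stack access and exceptions are not modeled.

```python
CF, PF, AF, ZF, SF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7
TF, IF, DF, OF = 1 << 8, 1 << 9, 1 << 10, 1 << 11
IOPL_MASK, NT = 3 << 12, 1 << 14
RF, VM, AC = 1 << 16, 1 << 17, 1 << 18
VIF, VIP, ID = 1 << 19, 1 << 20, 1 << 21

# Flags POPF may write; VIP, VIF, and VM are never changed, RF is always cleared.
MODIFIABLE = CF | PF | AF | ZF | SF | TF | IF | DF | OF | IOPL_MASK | NT | AC | ID

def popf_protected(old_rflags: int, popped: int, cpl: int) -> int:
    """Illustrative model of the protected-mode flag merge."""
    mask = MODIFIABLE
    if cpl != 0:
        mask &= ~IOPL_MASK                    # IOPL changes only at CPL 0
    if cpl > (old_rflags & IOPL_MASK) >> 12:
        mask &= ~IF                           # IF changes only if CPL <= IOPL
    merged = (old_rflags & ~mask) | (popped & mask)
    return merged & ~RF                       # RF cleared
```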

In virtual-8086 mode, if the IOPL field is less than 3, attempting to execute a POPFx or PUSHFx 
instruction while VME is not enabled, or while the operand size is not 16-bit, generates a #GP exception. 

In 64-bit mode, this instruction defaults to a 64-bit operand size; there is no prefix available to encode 
a 32-bit operand size. 


Mnemonic   Opcode   Description
POPF       9D       Pop a word from the stack into the FLAGS register.
POPFD      9D       Pop a doubleword from the stack into the EFLAGS register. (No prefix for encoding this in 64-bit mode.)
POPFQ      9D       Pop a quadword from the stack to the RFLAGS register.

Action

// See "Pseudocode Definition" on page 57.

POPF_START:

IF (REAL_MODE)
    POPF_REAL
ELSIF (PROTECTED_MODE)
    POPF_PROTECTED
ELSE // (VIRTUAL_MODE)
    POPF_VIRTUAL

POPF_REAL:

    POP.v temp_RFLAGS
    RFLAGS.v = temp_RFLAGS        // VIF,VIP,VM unchanged
                                  // RF cleared
    EXIT

POPF_PROTECTED:

    POP.v temp_RFLAGS
    RFLAGS.v = temp_RFLAGS        // VIF,VIP,VM unchanged
                                  // IOPL changed only if (CPL==0)
                                  // IF changed only if (CPL<=old_RFLAGS.IOPL)
                                  // RF cleared
    EXIT

POPF_VIRTUAL:

    IF (RFLAGS.IOPL==3)
    {
        POP.v temp_RFLAGS
        RFLAGS.v = temp_RFLAGS    // VIF,VIP,VM,IOPL unchanged
                                  // RF cleared
        EXIT
    }
    ELSIF ((CR4.VME==1) && (OPERAND_SIZE==16))
    {
        POP.w temp_RFLAGS
        IF (((temp_RFLAGS.IF==1) && (RFLAGS.VIP==1)) || (temp_RFLAGS.TF==1))
            EXCEPTION [#GP(0)]
                                  // notify the virtual-mode-manager to deliver
                                  // the task's pending interrupts
        RFLAGS.w = temp_RFLAGS    // IF,IOPL unchanged
                                  // RFLAGS.VIF=temp_RFLAGS.IF
                                  // RF cleared
        EXIT
    }
    ELSE // ((RFLAGS.IOPL<3) && ((CR4.VME==0) || (OPERAND_SIZE!=16)))
        EXCEPTION [#GP(0)]


Related Instructions 

PUSHF, PUSHFD, PUSHFQ 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
M           M     M           0     M     M     M     M     M     M     M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 
Exceptions

Exception                 Real  Virtual-8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP         X                        The I/O privilege level was less than 3 and one of the following conditions was true:
                                                         • CR4.VME was 0.
                                                         • The effective operand size was 32-bit.
                                                         • Both the original EFLAGS.VIP and the new EFLAGS.IF bits were set.
                                                         • The new EFLAGS.TF bit was set.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.

PREFETCH Prefetch L1 Data-Cache Line 

PREFETCHW 

Loads the entire 64-byte aligned memory sequence containing the specified memory address into the 
L1 data cache. The position of the specified memory address within the 64-byte cache line is 
irrelevant. If a cache hit occurs, or if a memory fault is detected, no bus cycle is initiated and the 
instruction is treated as a NOP. 

The PREFETCHW instruction loads the prefetched line and sets the cache-line state to Modified, in 
anticipation of subsequent data writes to the line. The PREFETCH instruction, by contrast, typically 
sets the cache-line state to Exclusive (depending on the hardware implementation). 

The opcodes for the PREFETCH/PREFETCHW instructions include the ModRM byte; however, only 
the memory form of ModRM is valid. The register form of ModRM causes an invalid-opcode 
exception. Because there is no destination register, the three destination register field bits of the 
ModRM byte define the type of prefetch to be performed. The bit patterns 000b and 001b define the 
PREFETCH and PREFETCHW instructions, respectively. All other bit patterns are reserved for future 
use. 

The reserved PREFETCH types do not result in an invalid-opcode exception if executed. Instead, for 
forward compatibility with future processors that may implement additional forms of the PREFETCH 
instruction, all reserved PREFETCH types are implemented as synonyms of the basic PREFETCH 
type (the PREFETCH instruction with type 000b). 
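
The ModRM rules in the preceding paragraphs can be sketched as a small decoder for the byte that follows the 0F 0D opcode. This is an illustrative model only; `decode_0f0d` and `line_base` are hypothetical names, and the 64-byte line size is taken from the description above.

```python
def decode_0f0d(modrm: int) -> str:
    """Illustrative decode of the ModRM byte following opcode 0F 0D."""
    mod = (modrm >> 6) & 0b11
    reg = (modrm >> 3) & 0b111
    if mod == 0b11:
        # The register form of ModRM is invalid for these instructions.
        raise ValueError("register form: invalid-opcode exception (#UD)")
    # 001b selects PREFETCHW; 000b and all reserved patterns behave as PREFETCH.
    return "PREFETCHW" if reg == 0b001 else "PREFETCH"

def line_base(address: int, line_size: int = 64) -> int:
    """Start of the aligned cache line containing the given address."""
    return address & ~(line_size - 1)
```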

The operation of these instructions is implementation-dependent. The processor implementation can 
ignore or change these instructions. The size of the cache line also depends on the implementation, 
with a minimum size of 32 bytes. For details on the use of this instruction, see the processor data sheets 
or other software-optimization documentation relating to particular hardware implementations. 

When paging is enabled and PREFETCHW performs a prefetch from a writable page, it may set the 
PTE Dirty bit to 1. 

Support for the PREFETCH and PREFETCHW instructions is indicated by CPUID 
Fn8000_0001_ECX[3DNowPrefetch] OR Fn8000_0001_EDX[LM] OR 
Fn8000_0001_EDX[3DNow] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic          Opcode     Description
PREFETCH mem8     0F 0D /0   Prefetch processor cache line into L1 data cache.
PREFETCHW mem8    0F 0D /1   Prefetch processor cache line into L1 data cache and mark it modified.

Related Instructions 

PREFETCHlevel 

rFLAGS Affected 

None 


Exceptions 


Exception (vector)        Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          PREFETCH and PREFETCHW instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[3DNowPrefetch] AND Fn8000_0001_EDX[LM] AND Fn8000_0001_EDX[3DNow] = 0.
                          X     X             X          The operand was a register.

PREFETCHlevel Prefetch Data to Cache Level level 

Loads a cache line from the specified memory address into the data-cache level specified by the 
locality reference bits 5:3 of the ModRM byte. Table 3-3 on page 278 lists the locality reference 
options for the instruction. 
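
As a sketch of how the locality reference is selected, the following helper extracts ModRM bits 5:3 and maps the /0 through /3 opcode extensions listed for this instruction to their locality names. The function and table names are hypothetical, and values 4 through 7 are returned as `None` because this section does not describe them.

```python
# Opcode extension (ModRM bits 5:3) to locality reference, per 0F 18 /0 .. /3.
LOCALITY = {0: "NTA", 1: "T0", 2: "T1", 3: "T2"}

def prefetch_locality(modrm: int):
    """Return the locality reference name, or None if not described here."""
    return LOCALITY.get((modrm >> 3) & 0b111)
```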

This instruction loads a cache line even if the mem8 address is not aligned with the start of the line. If 
the cache line is already contained in a cache level that is lower than the specified locality reference, or 
if a memory fault is detected, a bus cycle is not initiated and the instruction is treated as a NOP. 

The operation of this instruction is implementation-dependent. The processor implementation can 
ignore or change this instruction. The size of the cache line also depends on the implementation, with a 
minimum size of 32 bytes. AMD processors alias PREFETCHT1 and PREFETCHT2 to PREFETCHT0. 
For details on the use of this instruction, see the software-optimization documentation relating to 
particular hardware implementations. 


Mnemonic            Opcode     Description
PREFETCHNTA mem8    0F 18 /0   Move data closer to the processor using the NTA reference.
PREFETCHT0 mem8     0F 18 /1   Move data closer to the processor using the T0 reference.
PREFETCHT1 mem8     0F 18 /2   Move data closer to the processor using the T1 reference.
PREFETCHT2 mem8     0F 18 /3   Move data closer to the processor using the T2 reference.


Table 3-3. Locality References for the Prefetch Instructions 

Locality Reference   Description
NTA                  Non-Temporal Access: Move the specified data into the processor with 
                     minimum cache pollution. This is intended for data that will be used only 
                     once, rather than repeatedly. The specific technique for minimizing cache 
                     pollution is implementation-dependent and may include such techniques 
                     as allocating space in a software-invisible buffer, allocating a cache line in 
                     only a single way, etc. For details, see the software-optimization 
                     documentation for a particular hardware implementation. 
T0                   All Cache Levels: Move the specified data into all cache levels. 
T1                   Level 2 and Higher: Move the specified data into all cache levels except 
                     0th level (L1) cache. 
T2                   Level 3 and Higher: Move the specified data into all cache levels except 
                     0th level (L1) and 1st level (L2) caches. 


Related Instructions 

PREFETCH, PREFETCHW 
rFLAGS Affected 

None 

Exceptions 

None 



AMD64 Technology 


24594 — Rev. 3.28—September 2019 


PUSH Push onto Stack 

Decrements the stack pointer and then copies the specified immediate value or the value in the 
specified register or memory location to the top of the stack (the memory location pointed to by 
SS:rSP). 

The operand-size attribute determines the number of bytes pushed to the stack. The stack-size attribute 
determines whether SP, ESP, or RSP is the stack pointer. The address-size attribute is used only to 
locate the memory operand when pushing a memory operand to the stack. 

If the instruction pushes the stack pointer (rSP), the resulting value on the stack is that of rSP before 
execution of the instruction. 
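
The decrement-then-store order, and the special case of pushing rSP itself, can be illustrated with a short model. It assumes a flat little-endian stack and omits all segment and limit checks; the helper names are hypothetical. A sign-extension helper for the imm8 form is included for the same illustrative purpose.

```python
def push(memory: bytearray, rsp: int, value: int, operand_size: int = 64) -> int:
    """Illustrative model: decrement the stack pointer, store value, return new rsp."""
    nbytes = operand_size // 8
    new_rsp = rsp - nbytes
    truncated = value & ((1 << operand_size) - 1)
    memory[new_rsp:new_rsp + nbytes] = truncated.to_bytes(nbytes, "little")
    return new_rsp

def push_rsp(memory: bytearray, rsp: int, operand_size: int = 64) -> int:
    """PUSH rSP stores the value rSP held before the instruction executed."""
    return push(memory, rsp, rsp, operand_size)

def sign_extend_imm8(imm8: int, operand_size: int) -> int:
    """Sign-extend an 8-bit immediate to 16, 32, or 64 bits (PUSH imm8)."""
    imm8 &= 0xFF
    if imm8 & 0x80:
        imm8 -= 0x100
    return imm8 & ((1 << operand_size) - 1)
```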

There is a PUSH CS instruction but no corresponding POP CS. The RET (Far) instruction pops a value 
from the top of stack into the CS register as part of its operation. 

In 64-bit mode, the operand size of all PUSH instructions defaults to 64 bits, and there is no prefix 
available to encode a 32-bit operand size. Using the PUSH CS, PUSH DS, PUSH ES, or PUSH SS 
instructions in 64-bit mode generates an invalid-opcode exception. 

Pushing an odd number of 16-bit operands when the stack address-size attribute is 32 results in a 
misaligned stack pointer. 


Mnemonic          Opcode    Description
PUSH reg/mem16    FF /6     Push the contents of a 16-bit register or memory operand onto the stack.
PUSH reg/mem32    FF /6     Push the contents of a 32-bit register or memory operand onto the stack. (No prefix for encoding this in 64-bit mode.)
PUSH reg/mem64    FF /6     Push the contents of a 64-bit register or memory operand onto the stack.
PUSH reg16        50 +rw    Push the contents of a 16-bit register onto the stack.
PUSH reg32        50 +rd    Push the contents of a 32-bit register onto the stack. (No prefix for encoding this in 64-bit mode.)
PUSH reg64        50 +rq    Push the contents of a 64-bit register onto the stack.
PUSH imm8         6A ib     Push an 8-bit immediate value (sign-extended to 16, 32, or 64 bits) onto the stack.
PUSH imm16        68 iw     Push a 16-bit immediate value onto the stack.
PUSH imm32        68 id     Push a 32-bit immediate value onto the stack. (No prefix for encoding this in 64-bit mode.)
PUSH imm64        68 id     Push a sign-extended 32-bit immediate value onto the stack.
PUSH CS           0E        Push the CS selector onto the stack. (Invalid in 64-bit mode.)
PUSH SS           16        Push the SS selector onto the stack. (Invalid in 64-bit mode.)
PUSH DS           1E        Push the DS selector onto the stack. (Invalid in 64-bit mode.)
PUSH ES           06        Push the ES selector onto the stack. (Invalid in 64-bit mode.)
PUSH FS           0F A0     Push the FS selector onto the stack.
PUSH GS           0F A8     Push the GS selector onto the stack.

Related Instructions

POP

rFLAGS Affected

None

Exceptions

Exception                 Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                           X          PUSH CS, PUSH DS, PUSH ES, or PUSH SS was executed in 64-bit mode.
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.

PUSHA Push All GPRs onto Stack 

PUSHAD 

Pushes the contents of the eAX, eCX, eDX, eBX, eSP (original value), eBP, eSI, and eDI general- 
purpose registers onto the stack in that order. This instruction decrements the stack pointer by 16 or 32 
depending on operand size. 
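
The push order, including the use of the original stack-pointer value, can be sketched as follows. This is an illustrative model over flat little-endian memory with no segment checks; `pusha`, `PUSHA_ORDER`, and the register dictionary are hypothetical names.

```python
# Push order from the description above.
PUSHA_ORDER = ("AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI")

def pusha(memory: bytearray, regs: dict, operand_size: int = 16) -> int:
    """Illustrative model of PUSHA (16-bit) or PUSHAD (32-bit); returns the new SP."""
    if operand_size not in (16, 32):
        raise ValueError("operand size must be 16 or 32 bits")
    nbytes = operand_size // 8
    limit = (1 << operand_size) - 1
    original_sp = regs["SP"]      # the value pushed for SP is the pre-instruction one
    sp = original_sp
    for name in PUSHA_ORDER:
        value = original_sp if name == "SP" else regs[name]
        sp -= nbytes
        memory[sp:sp + nbytes] = (value & limit).to_bytes(nbytes, "little")
    return sp
```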

Using the PUSHA or PUSHAD instruction in 64-bit mode generates an invalid-opcode exception. 


Mnemonic   Opcode   Description
PUSHA      60       Push the contents of the AX, CX, DX, BX, original SP, BP, SI, and DI registers onto the stack. (Invalid in 64-bit mode.)
PUSHAD     60       Push the contents of the EAX, ECX, EDX, EBX, original ESP, EBP, ESI, and EDI registers onto the stack. (Invalid in 64-bit mode.)


Related Instructions 

POPA, POPAD 

rFLAGS Affected 

None 

Exceptions 


Exception                 Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                           X          This instruction was executed in 64-bit mode.
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.

PUSHF Push rFLAGS onto Stack 

PUSHFD 

PUSHFQ 

Decrements the rSP register and copies the rFLAGS register (except for the VM and RF flags) onto the 
stack. The instruction clears the VM and RF flags in the rFLAGS image before putting it on the stack. 
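
The image that reaches the stack can be modeled as a simple mask operation. The sketch assumes the standard rFLAGS positions for RF (bit 16) and VM (bit 17), as listed in the rFLAGS tables of this reference; the function name `pushf_image` is hypothetical.

```python
RF = 1 << 16   # resume flag
VM = 1 << 17   # virtual-8086 mode flag

def pushf_image(rflags: int, operand_size: int = 64) -> int:
    """Illustrative model of the value PUSHF/PUSHFD/PUSHFQ writes to the stack."""
    image = rflags & ~(RF | VM)               # VM and RF are cleared in the image
    return image & ((1 << operand_size) - 1)  # 2, 4, or 8 bytes are pushed
```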

The instruction pushes 2, 4, or 8 bytes, depending on the operand size. 

In 64-bit mode, this instruction defaults to a 64-bit operand size and there is no prefix available to 
encode a 32-bit operand size. 

In virtual-8086 mode, if system software has set the IOPL field to a value less than 3, a general- 
protection exception occurs if application software attempts to execute PUSHFx or POPFx while 
VME is not enabled or the operand size is not 16-bit. 


Mnemonic   Opcode   Description
PUSHF      9C       Push the FLAGS word onto the stack.
PUSHFD     9C       Push the EFLAGS doubleword onto stack. (No prefix for encoding this in 64-bit mode.)
PUSHFQ     9C       Push the RFLAGS quadword onto stack.

Action

// See "Pseudocode Definition" on page 57.

PUSHF_START:

IF (REAL_MODE)
    PUSHF_REAL
ELSIF (PROTECTED_MODE)
    PUSHF_PROTECTED
ELSE // (VIRTUAL_MODE)
    PUSHF_VIRTUAL

PUSHF_REAL:

    PUSH.v old_RFLAGS             // Pushed with RF and VM cleared.
    EXIT

PUSHF_PROTECTED:

    PUSH.v old_RFLAGS             // Pushed with RF cleared.
    EXIT

PUSHF_VIRTUAL:

    IF (RFLAGS.IOPL==3)
    {
        PUSH.v old_RFLAGS         // Pushed with RF,VM cleared.
        EXIT
    }
    ELSIF ((CR4.VME==1) && (OPERAND_SIZE==16))
    {
        PUSH.v old_RFLAGS         // Pushed with VIF in the IF position.
                                  // Pushed with IOPL=3.
        EXIT
    }
    ELSE // ((RFLAGS.IOPL<3) && ((CR4.VME==0) || (OPERAND_SIZE!=16)))
        EXCEPTION [#GP(0)]


Related Instructions 

POPF, POPFD, POPFQ 

rFLAGS Affected 

None 


Exceptions 


Exception                 Real  Virtual-8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP         X                        The I/O privilege level was less than 3 and either VME was not enabled or the operand size was not 16-bit.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.

RCL Rotate Through Carry Left 

Rotates the bits of a register or memory location (first operand) to the left (more significant bit 
positions) and through the carry flag by the number of bit positions in an unsigned immediate value or 
the CL register (second operand). The bits rotated through the carry flag are rotated back in at the right 
end (lsb) of the first operand location. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. 

For 1-bit rotates, the instruction sets the OF flag to the logical xor of the CF bit (after the rotate) and 
the most significant bit of the result. When the rotate count is greater than 1, the OF flag is undefined. 
When the rotate count is 0, no flags are affected. 
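
The rotate, count masking, and OF rule above can be sketched as a bit-at-a-time model. It is illustrative only: the function name `rcl` is hypothetical, and OF is returned as `None` both when it is unaffected (count 0) and when it is undefined (count greater than 1).

```python
def rcl(value: int, count: int, cf: int, width: int = 32):
    """Illustrative model of RCL; returns (result, CF, OF)."""
    count &= 0x3F if width == 64 else 0x1F   # count masked to 0..63 or 0..31
    mask = (1 << width) - 1
    value &= mask
    for _ in range(count):
        msb = (value >> (width - 1)) & 1
        value = ((value << 1) | cf) & mask   # old CF enters at the lsb
        cf = msb                             # old msb becomes the new CF
    of = None
    if count == 1:
        # OF = CF (after the rotate) XOR most significant bit of the result.
        of = cf ^ ((value >> (width - 1)) & 1)
    return value, cf, of
```

Because the carry flag takes part in the rotation, rotating an 8-bit operand by 9 positions returns both the operand and CF to their starting values.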


Mnemonic               Opcode     Description
RCL reg/mem8, 1        D0 /2      Rotate the 9 bits consisting of the carry flag and an 8-bit register or memory location left 1 bit.
RCL reg/mem8, CL       D2 /2      Rotate the 9 bits consisting of the carry flag and an 8-bit register or memory location left the number of bits specified in the CL register.
RCL reg/mem8, imm8     C0 /2 ib   Rotate the 9 bits consisting of the carry flag and an 8-bit register or memory location left the number of bits specified by an 8-bit immediate value.
RCL reg/mem16, 1       D1 /2      Rotate the 17 bits consisting of the carry flag and a 16-bit register or memory location left 1 bit.
RCL reg/mem16, CL      D3 /2      Rotate the 17 bits consisting of the carry flag and a 16-bit register or memory location left the number of bits specified in the CL register.
RCL reg/mem16, imm8    C1 /2 ib   Rotate the 17 bits consisting of the carry flag and a 16-bit register or memory location left the number of bits specified by an 8-bit immediate value.
RCL reg/mem32, 1       D1 /2      Rotate the 33 bits consisting of the carry flag and a 32-bit register or memory location left 1 bit.
RCL reg/mem32, CL      D3 /2      Rotate the 33 bits consisting of the carry flag and a 32-bit register or memory location left the number of bits specified in the CL register.
RCL reg/mem32, imm8    C1 /2 ib   Rotate the 33 bits consisting of the carry flag and a 32-bit register or memory location left the number of bits specified by an 8-bit immediate value.
RCL reg/mem64, 1       D1 /2      Rotate the 65 bits consisting of the carry flag and a 64-bit register or memory location left 1 bit.
RCL reg/mem64, CL      D3 /2      Rotate the 65 bits consisting of the carry flag and a 64-bit register or memory location left the number of bits specified in the CL register.
RCL reg/mem64, imm8    C1 /2 ib   Rotate the 65 bits consisting of the carry flag and a 64-bit register or memory location left the number of bits specified by an 8-bit immediate value.


Related Instructions 

RCR, ROL, ROR 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                                               M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual-8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.

RCR Rotate Through Carry Right 

Rotates the bits of a register or memory location (first operand) to the right (toward the less significant 
bit positions) and through the carry flag by the number of bit positions in an unsigned immediate value 
or the CL register (second operand). The bits rotated through the carry flag are rotated back in at the 
left end (msb) of the first operand location. 

The processor masks the upper three bits in the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. 

For 1-bit rotates, the instruction sets the OF flag to the logical xor of the two most significant bits of 
the result. When the rotate count is greater than 1, the OF flag is undefined. When the rotate count is 0, 
no flags are affected. 
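
A matching bit-at-a-time sketch for the right rotate, under the same assumptions as any software model: the name `rcr` is hypothetical, and OF is returned as `None` when it is unaffected (count 0) or undefined (count greater than 1).

```python
def rcr(value: int, count: int, cf: int, width: int = 32):
    """Illustrative model of RCR; returns (result, CF, OF)."""
    count &= 0x3F if width == 64 else 0x1F   # count masked to 0..63 or 0..31
    value &= (1 << width) - 1
    for _ in range(count):
        lsb = value & 1
        value = (value >> 1) | (cf << (width - 1))   # old CF enters at the msb
        cf = lsb                                     # old lsb becomes the new CF
    of = None
    if count == 1:
        # OF = XOR of the two most significant bits of the result.
        of = ((value >> (width - 1)) ^ (value >> (width - 2))) & 1
    return value, cf, of
```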


Mnemonic               Opcode     Description
RCR reg/mem8, 1        D0 /3      Rotate the 9 bits consisting of the carry flag and an 8-bit register or memory location right 1 bit.
RCR reg/mem8, CL       D2 /3      Rotate the 9 bits consisting of the carry flag and an 8-bit register or memory location right the number of bits specified in the CL register.
RCR reg/mem8, imm8     C0 /3 ib   Rotate the 9 bits consisting of the carry flag and an 8-bit register or memory location right the number of bits specified by an 8-bit immediate value.
RCR reg/mem16, 1       D1 /3      Rotate the 17 bits consisting of the carry flag and a 16-bit register or memory location right 1 bit.
RCR reg/mem16, CL      D3 /3      Rotate the 17 bits consisting of the carry flag and a 16-bit register or memory location right the number of bits specified in the CL register.
RCR reg/mem16, imm8    C1 /3 ib   Rotate the 17 bits consisting of the carry flag and a 16-bit register or memory location right the number of bits specified by an 8-bit immediate value.
RCR reg/mem32, 1       D1 /3      Rotate the 33 bits consisting of the carry flag and a 32-bit register or memory location right 1 bit.
RCR reg/mem32, CL      D3 /3      Rotate the 33 bits consisting of the carry flag and a 32-bit register or memory location right the number of bits specified in the CL register.
RCR reg/mem32, imm8    C1 /3 ib   Rotate the 33 bits consisting of the carry flag and a 32-bit register or memory location right the number of bits specified by an 8-bit immediate value.
RCR reg/mem64, 1       D1 /3      Rotate the 65 bits consisting of the carry flag and a 64-bit register or memory location right 1 bit.
RCR reg/mem64, CL      D3 /3      Rotate the 65 bits consisting of the carry flag and a 64-bit register or memory location right the number of bits specified in the CL register.
RCR reg/mem64, imm8    C1 /3 ib   Rotate the 65 bits consisting of the carry flag and a 64-bit register or memory location right the number of bits specified by an 8-bit immediate value.


Related Instructions 

RCL, ROR, ROL 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                                               M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
RDFSBASE Read FS.base

RDGSBASE Read GS.base 

Copies the base field of the FS or GS segment descriptor to the specified register. When supported and 
enabled, these instructions can be executed at any processor privilege level. The RDFSBASE and 
RDGSBASE instructions are only defined in 64-bit mode. 

System software must set the FSGSBASE bit (bit 16) of CR4 to enable the RDFSBASE and 
RDGSBASE instructions. 

Support for this instruction is indicated by CPUID Fn0000_0007_EBX_x0[FSGSBASE] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic          Opcode         Description
RDFSBASE reg32    F3 0F AE /0    Copy the lower 32 bits of FS.base to the specified general-purpose register.
RDFSBASE reg64    F3 0F AE /0    Copy the entire 64-bit contents of FS.base to the specified general-purpose register.
RDGSBASE reg32    F3 0F AE /1    Copy the lower 32 bits of GS.base to the specified general-purpose register.
RDGSBASE reg64    F3 0F AE /1    Copy the entire 64-bit contents of GS.base to the specified general-purpose register.


Related Instructions 

WRFSBASE, WRGSBASE 

rFLAGS Affected 

None. 

Exceptions 


Exception  Legacy  Compatibility  64-bit  Cause of Exception
#UD        X       X                      Instruction is not valid in compatibility or legacy modes.
                                  X       Instruction not supported as indicated by CPUID Fn0000_0007_EBX_x0[FSGSBASE] = 0 or, if supported, not enabled in CR4.
RDPID Read Processor ID

RDPID reads the value of the TSC_AUX MSR, which is also used by the RDTSCP instruction, into the
specified destination register. Normal operand-size prefixes do not apply; the update is either
32 bits or 64 bits, depending on the current mode.

The RDPID instruction can be used to access the TSC_AUX value at CPL > 0 in cases where the
operating system has disabled unprivileged execution of the RDTSCP instruction.

The content of the TSC_AUX MSR, including how and even whether it actually indicates a processor
ID, is a matter of operating system convention.

The RDPID instruction is supported if the feature flag CPUID Fn0000_0007_ECX[22] = 1.


Mnemonic    Opcode         Description
RDPID       F3 0F C7 /7    Read TSC_AUX


Related Instructions 

RDTSCP 

rFLAGS Affected 

None


ID 

VIP 

VIF 

AC 

VM 

RF 

NT 

IOPL 

OF 

DF 

IF 

TF 

SF 

ZF 

AF 

PF 

CF 

















M 

21 

20 

19 

18 

17 

16 

14 

13:12 

11 

10 

9 

8 

7 

6 

4 

2 

0 

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          Instruction not supported, as indicated by CPUID Fn0000_0007_ECX[22] = 0.
RDPRU Read Processor Register

The RDPRU instruction gives access to some processor registers that are typically only accessible
when the privilege level is zero. ECX is the implicit operand that specifies which register to read.
RDPRU places the specified register's value into EDX:EAX.

The RDPRU instruction normally can be executed at any privilege level. When CR4.TSD = 1, RDPRU
can only be used when the privilege level is zero. When CPL > 0 and CR4.TSD = 1, the RDPRU
instruction generates a #UD fault.

The RDPRU instruction is supported if the feature flag CPUID Fn8000_0008_EBX[4] = 1. The 16-bit
field in CPUID Fn8000_0008_EDX[31:16] returns the largest ECX value that returns a valid register.
Any unsupported ECX values return zero. Registers currently supported by ECX values are:

• ECX Value 0 = Register MPERF 

• ECX Value 1 = Register APERF 
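The selector rules above can be illustrated with a small software model. This is a hedged sketch only: the function `rdpru`, the `InvalidOpcode` exception class, and the dictionary used to hold register contents are hypothetical names introduced for the example, not part of any real API.

```python
MPERF, APERF = 0, 1  # ECX selector values listed in the text


class InvalidOpcode(Exception):
    """Stands in for a #UD fault in this model (hypothetical)."""


def rdpru(ecx, registers, cpl=3, cr4_tsd=0):
    """Return (EDX, EAX) for an RDPRU executed with selector `ecx`.

    `registers` maps a selector value to a 64-bit register value.
    """
    if cr4_tsd == 1 and cpl > 0:
        raise InvalidOpcode("RDPRU at CPL > 0 with CR4.TSD = 1")
    # Unsupported selector values read as zero.
    value = registers.get(ecx, 0) & 0xFFFFFFFFFFFFFFFF
    return value >> 32, value & 0xFFFFFFFF


regs = {MPERF: 0x0000000100000002, APERF: 5}
print(rdpru(MPERF, regs))   # (1, 2)
print(rdpru(7, regs))       # (0, 0)
```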

When virtualization is enabled, this instruction can be intercepted by the hypervisor. The intercept bit
is at VMCB byte offset 10h, bit 14.


Mnemonic    Opcode      Description
RDPRU       0F 01 FD    Copy register specified by ECX into EDX:EAX

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       0     0     0     0     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          Instruction not supported, as indicated by CPUID Fn8000_0008_EBX[RDPRU] = 0, or CPL > 0 and CR4.TSD = 1.
RDRAND Read Random

Loads the destination register with a hardware-generated random value. 

The size of the returned value in bits is determined by the size of the destination register.

Hardware modifies the CF flag to indicate whether the value returned in the destination register is
valid. If CF = 1, the value is valid. If CF = 0, the value is invalid. Software must test the state of the CF
flag prior to using the value returned in the destination register to determine if the value is valid. If the
returned value is invalid, software must execute the instruction again. Software should implement a
retry limit to ensure forward progress of code.
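The retry guidance above can be sketched as a bounded loop. This is an illustration under stated assumptions: `rdrand_step` is a hypothetical callable standing in for one execution of RDRAND and returning the pair (CF, value); the default limit of 10 attempts is an arbitrary choice for the example, not a value taken from this manual.

```python
def rdrand_with_retry(rdrand_step, max_retries=10):
    """Call `rdrand_step` until it reports CF = 1 or the retry limit is hit.

    Returns the valid value, or None if every attempt reported CF = 0.
    """
    for _ in range(max_retries):
        cf, value = rdrand_step()
        if cf == 1:          # CF = 1: the returned value is valid
            return value
    return None              # give up so the caller can make forward progress


# Stand-in for hardware: two invalid results followed by a valid one.
attempts = iter([(0, 0), (0, 0), (1, 0x1234)])
print(hex(rdrand_with_retry(lambda: next(attempts))))   # 0x1234
```

Returning None rather than looping forever lets the caller fall back to another entropy source or report an error.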

The execution of RDRAND clears the OF, SF, ZF, AF, and PF flags. 

Support for the RDRAND instruction is optional. On processors that support the instruction, CPUID
Fn0000_0001_ECX[RDRAND] = 1.

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic        Opcode      Description
RDRAND reg16    0F C7 /6    Load the destination register with a 16-bit random number.
RDRAND reg32    0F C7 /6    Load the destination register with a 32-bit random number.
RDRAND reg64    0F C7 /6    Load the destination register with a 64-bit random number.

Related Instructions

RDSEED


rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       0     0     0     0     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          Instruction not supported as indicated by CPUID Fn0000_0001_ECX[RDRAND] = 0.
RDSEED Read Random Seed

Loads the destination register with a hardware-generated random “seed” value. 

The size of the returned value in bits is determined by the size of the destination register.

Hardware modifies the CF flag to indicate whether the value returned in the destination register is 
valid. If CF = 1, the value is valid. If CF = 0, the value is invalid and will be returned as zero. Software 
must test the state of the CF flag prior to using the value returned in the destination register to 
determine if the value is valid. If the returned value is invalid, software must execute the instruction 
again. Software should implement a retry limit to ensure forward progress of code. 

The execution of RDSEED clears the OF, SF, ZF, AF, and PF flags. 


Mnemonic        Opcode      Description
RDSEED reg16    0F C7 /7    Read 16-bit random seed
RDSEED reg32    0F C7 /7    Read 32-bit random seed
RDSEED reg64    0F C7 /7    Read 64-bit random seed


Related Instructions 

RDRAND 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       0     0     0     0     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X             X          Instruction not supported as indicated by CPUID Fn0000_0007_EBX_x0[RDSEED] = 0.
RET (Near) Near Return from Called Procedure

Returns from a procedure previously entered by a CALL near instruction. This form of the RET 
instruction returns to a calling procedure within the current code segment. 

This instruction pops the rIP from the stack, with the size of the pop determined by the operand size.
The new rIP is then zero-extended to 64 bits. The RET instruction can accept an immediate value
operand that it adds to the rSP after it pops the target rIP. This action skips over any parameters
previously passed to the subroutine that are no longer needed.
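The pop-then-adjust sequence can be sketched with a byte array standing in for stack memory. This is a simplified model under stated assumptions: the function `ret_near` and its parameters are hypothetical, and no limit, canonical-address, or alignment checks are modeled.

```python
def ret_near(memory, rsp, imm16=0, operand_bytes=8):
    """Model a near RET: pop rIP, then add the optional immediate to rSP.

    `memory` is a bytes-like object standing in for the stack segment.
    Returns (new_rip, new_rsp).
    """
    # Pop the return address; a 2- or 4-byte value is zero-extended
    # simply by being held in a wider integer.
    rip = int.from_bytes(memory[rsp:rsp + operand_bytes], "little")
    rsp += operand_bytes
    # The immediate releases the bytes used for parameters.
    rsp += imm16 & 0xFFFF
    return rip, rsp


stack = bytearray(64)
stack[16:24] = (0x401000).to_bytes(8, "little")
print(ret_near(stack, 16))             # (4198400, 24)
print(ret_near(stack, 16, imm16=16))   # (4198400, 40)
```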

In 64-bit mode, the operand size defaults to 64 bits (eight bytes) without the need for a REX prefix. No 
prefix is available to encode a 32-bit operand size in 64-bit mode. 

See RET (Far) for information on far returns—returns to procedures located outside of the current 
code segment. For details about control-flow instructions, see “Control Transfers” in Volume 1, and 
“Control-Transfer Privilege Checks” in Volume 2. 


Mnemonic     Opcode    Description
RET          C3        Near return to the calling procedure.
RET imm16    C2 iw     Near return to the calling procedure then pop the specified number of bytes from the stack.

Related Instructions

CALL (Near), CALL (Far), RET (Far)

rFLAGS Affected

None


Exceptions

Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          The target offset exceeded the code segment limit or was non-canonical.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
RET (Far) Far Return from Called Procedure

Returns from a procedure previously entered by a CALL Far instruction. This form of the RET 
instruction returns to a calling procedure in a different segment than the current code segment. It can 
return to the same CPL or to a less privileged CPL. 

RET Far pops a target CS and rIP from the stack. If the new code segment is less privileged than the 
current code segment, the stack pointer is incremented by the number of bytes indicated by the 
immediate operand, if present; then a new SS and rSP are also popped from the stack. 

The final value of rSP is incremented by the number of bytes indicated by the immediate operand, if 
present. This action skips over the parameters (previously passed to the subroutine) that are no longer 
needed. 

All stack pops are determined by the operand size. If necessary, the target rIP is zero-extended to 64 
bits before assuming program control. 

If the CPL changes, the data segment selectors are set to NULL for any of the data segments (DS, ES, 
FS, GS) not accessible at the new CPL. 

See RET (Near) for information on near returns—returns to procedures located inside the current code 
segment. For details about control-flow instructions, see “Control Transfers” in Volume 1, and 
“Control-Transfer Privilege Checks” in Volume 2. 


Mnemonic      Opcode    Description
RETF          CB        Far return to the calling procedure.
RETF imm16    CA iw     Far return to the calling procedure, then pop the specified number of bytes from the stack.

Action

// Far returns (RETF)
// See "Pseudocode Definition" on page 57.

RETF_START:

    IF (REAL_MODE)
        RETF_REAL_OR_VIRTUAL
    ELSIF (PROTECTED_MODE)
        RETF_PROTECTED
    ELSE // (VIRTUAL_MODE)
        RETF_REAL_OR_VIRTUAL

RETF_REAL_OR_VIRTUAL:

    IF (OPCODE == retf imm16)
        temp_IMM = word-sized immediate specified in the instruction,
                   zero-extended to 64 bits
    ELSE // (OPCODE == retf)
        temp_IMM = 0

    POP.v temp_RIP
    POP.v temp_CS

    IF (temp_RIP > CS.limit)
        EXCEPTION [#GP(0)]

    CS.sel = temp_CS
    CS.base = temp_CS SHL 4

    RSP.s = RSP + temp_IMM
    RIP = temp_RIP
    EXIT

RETF_PROTECTED:

    IF (OPCODE == retf imm16)
        temp_IMM = word-sized immediate specified in the instruction,
                   zero-extended to 64 bits
    ELSE // (OPCODE == retf)
        temp_IMM = 0

    POP.v temp_RIP
    POP.v temp_CS

    temp_CPL = temp_CS.rpl

    IF (CPL==temp_CPL)
    {
        CS = READ_DESCRIPTOR (temp_CS, iret_chk)

        RSP.s = RSP + temp_IMM

        IF ((64BIT_MODE) && (temp_RIP is non-canonical)
            || (!64BIT_MODE) && (temp_RIP > CS.limit))
            EXCEPTION [#GP(0)]

        RIP = temp_RIP
        EXIT
    }
    ELSE // (CPL!=temp_CPL)
    {
        RSP.s = RSP + temp_IMM

        POP.v temp_RSP
        POP.v temp_SS

        CS = READ_DESCRIPTOR (temp_CS, iret_chk)

        CPL = temp_CPL

        IF ((64BIT_MODE) && (temp_RIP is non-canonical)
            || (!64BIT_MODE) && (temp_RIP > CS.limit))
            EXCEPTION [#GP(0)]

        SS = READ_DESCRIPTOR (temp_SS, ss_chk)

        RSP.s = temp_RSP + temp_IMM

        IF (changing CPL)
        {
            FOR (seg = ES, DS, FS, GS)
                IF ((seg.attr.dpl < CPL) && ((seg.attr.type == 'data')
                    || (seg.attr.type == 'non-conforming-code')))
                {
                    seg = NULL // can't use lower dpl data segment at higher cpl
                }
        }

        RIP = temp_RIP
        EXIT
    }

Related Instructions 

CALL (Near), CALL (Far), RET (Near) 

rFLAGS Affected 

None 


Exceptions 


Exception                             Real  Virtual 8086  Protected  Cause of Exception
Segment not present, #NP (selector)                       X          The return code segment was marked not present.
Stack, #SS                            X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
Stack, #SS (selector)                                     X          The return stack segment was marked not present.
General protection, #GP               X     X             X          The target offset exceeded the code segment limit or was non-canonical.
General protection, #GP (selector)                        X          The return code selector was a null selector.
                                                          X          The return stack selector was a null selector and the return mode was non-64-bit mode or CPL was 3.
                                                          X          The return code or stack descriptor exceeded the descriptor table limit.
                                                          X          The return code or stack selector's TI bit was set but the LDT selector was a null selector.
                                                          X          The segment descriptor for the return code was not a code segment.
                                                          X          The RPL of the return code segment selector was less than the CPL.
                                                          X          The return code segment was non-conforming and the segment selector's DPL was not equal to the RPL of the code segment's segment selector.
                                                          X          The return code segment was conforming and the segment selector's DPL was greater than the RPL of the code segment's segment selector.
                                                          X          The segment descriptor for the return stack was not a writable data segment.
                                                          X          The stack segment descriptor DPL was not equal to the RPL of the return code segment selector.
                                                          X          The stack segment selector RPL was not equal to the RPL of the return code segment selector.
Page fault, #PF                             X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                        X             X          An unaligned memory reference was performed while alignment checking was enabled.
ROL Rotate Left

Rotates the bits of a register or memory location (first operand) to the left (toward the more significant 
bit positions) by the number of bit positions in an unsigned immediate value or the CL register (second 
operand). The bits rotated out left are rotated back in at the right end (lsb) of the first operand location. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, it masks the upper two bits of the count, 
providing a count in the range of 0 to 63. 

After completing the rotation, the instruction sets the CF flag to the last bit rotated out (the lsb of the 
result). For 1-bit rotates, the instruction sets the OF flag to the logical xor of the CF bit (after the 
rotate) and the most significant bit of the result. When the rotate count is greater than 1, the OF flag is 
undefined. When the rotate count is 0, no flags are affected. 
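The count masking and flag rules for ROL can be sketched as follows. This is an illustrative model, not the processor's implementation: the function name `rol` and the use of `None` for a flag that is undefined or unaffected are conventions assumed for this example.

```python
def rol(value, count, width):
    """Model ROL on a `width`-bit operand. Returns (result, cf, of).

    cf and of are None when the masked count is 0 (no flags affected);
    of is also None when the count is greater than 1 (undefined).
    """
    mask = (1 << width) - 1
    value &= mask
    # The count is masked to 5 bits, or to 6 bits for a 64-bit destination.
    count &= 63 if width == 64 else 31
    if count == 0:
        return value, None, None
    r = count % width
    result = ((value << r) | (value >> (width - r))) & mask
    cf = result & 1                                   # last bit rotated out = lsb of result
    of = (cf ^ (result >> (width - 1))) if count == 1 else None
    return result, cf, of


print(rol(0x81, 1, 8))                          # (3, 1, 1)
print(hex(rol(0x12345678, 8, 32)[0]))           # 0x34567812
```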


Mnemonic               Opcode      Description
ROL reg/mem8, 1        D0 /0       Rotate an 8-bit register or memory operand left 1 bit.
ROL reg/mem8, CL       D2 /0       Rotate an 8-bit register or memory operand left the number of bits specified in the CL register.
ROL reg/mem8, imm8     C0 /0 ib    Rotate an 8-bit register or memory operand left the number of bits specified by an 8-bit immediate value.
ROL reg/mem16, 1       D1 /0       Rotate a 16-bit register or memory operand left 1 bit.
ROL reg/mem16, CL      D3 /0       Rotate a 16-bit register or memory operand left the number of bits specified in the CL register.
ROL reg/mem16, imm8    C1 /0 ib    Rotate a 16-bit register or memory operand left the number of bits specified by an 8-bit immediate value.
ROL reg/mem32, 1       D1 /0       Rotate a 32-bit register or memory operand left 1 bit.
ROL reg/mem32, CL      D3 /0       Rotate a 32-bit register or memory operand left the number of bits specified in the CL register.
ROL reg/mem32, imm8    C1 /0 ib    Rotate a 32-bit register or memory operand left the number of bits specified by an 8-bit immediate value.
ROL reg/mem64, 1       D1 /0       Rotate a 64-bit register or memory operand left 1 bit.
ROL reg/mem64, CL      D3 /0       Rotate a 64-bit register or memory operand left the number of bits specified in the CL register.
ROL reg/mem64, imm8    C1 /0 ib    Rotate a 64-bit register or memory operand left the number of bits specified by an 8-bit immediate value.

Related Instructions

RCL, RCR, ROR
rFLAGS Affected


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                                               M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
ROR Rotate Right

Rotates the bits of a register or memory location (first operand) to the right (toward the less significant 
bit positions) by the number of bit positions in an unsigned immediate value or the CL register (second 
operand). The bits rotated out right are rotated back in at the left end (the most significant bit) of the 
first operand location. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. 

After completing the rotation, the instruction sets the CF flag to the last bit rotated out (the most 
significant bit of the result). For 1-bit rotates, the instruction sets the OF flag to the logical xor of the 
two most significant bits of the result. When the rotate count is greater than 1, the OF flag is undefined. 
When the rotate count is 0, no flags are affected. 
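The ROR rules above can be modeled in a few lines. This is a sketch for illustration only: the function name `ror` and the use of `None` for a flag that is undefined or unaffected are assumptions of the example.

```python
def ror(value, count, width):
    """Model ROR on a `width`-bit operand. Returns (result, cf, of).

    cf and of are None when the masked count is 0 (no flags affected);
    of is also None when the count is greater than 1 (undefined).
    """
    mask = (1 << width) - 1
    value &= mask
    # The count is masked to 5 bits, or to 6 bits for a 64-bit destination.
    count &= 63 if width == 64 else 31
    if count == 0:
        return value, None, None
    r = count % width
    result = ((value >> r) | (value << (width - r))) & mask
    msb = result >> (width - 1)
    cf = msb                                          # last bit rotated out = msb of result
    of = (msb ^ ((result >> (width - 2)) & 1)) if count == 1 else None
    return result, cf, of


print(ror(0x01, 1, 8))                          # (128, 1, 1)
print(hex(ror(0x12345678, 8, 32)[0]))           # 0x78123456
```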


Mnemonic               Opcode      Description
ROR reg/mem8, 1        D0 /1       Rotate an 8-bit register or memory location right 1 bit.
ROR reg/mem8, CL       D2 /1       Rotate an 8-bit register or memory location right the number of bits specified in the CL register.
ROR reg/mem8, imm8     C0 /1 ib    Rotate an 8-bit register or memory location right the number of bits specified by an 8-bit immediate value.
ROR reg/mem16, 1       D1 /1       Rotate a 16-bit register or memory location right 1 bit.
ROR reg/mem16, CL      D3 /1       Rotate a 16-bit register or memory location right the number of bits specified in the CL register.
ROR reg/mem16, imm8    C1 /1 ib    Rotate a 16-bit register or memory location right the number of bits specified by an 8-bit immediate value.
ROR reg/mem32, 1       D1 /1       Rotate a 32-bit register or memory location right 1 bit.
ROR reg/mem32, CL      D3 /1       Rotate a 32-bit register or memory location right the number of bits specified in the CL register.
ROR reg/mem32, imm8    C1 /1 ib    Rotate a 32-bit register or memory location right the number of bits specified by an 8-bit immediate value.
ROR reg/mem64, 1       D1 /1       Rotate a 64-bit register or memory location right 1 bit.
ROR reg/mem64, CL      D3 /1       Rotate a 64-bit register or memory operand right the number of bits specified in the CL register.
ROR reg/mem64, imm8    C1 /1 ib    Rotate a 64-bit register or memory operand right the number of bits specified by an 8-bit immediate value.

Related Instructions

RCL, RCR, ROL
rFLAGS Affected


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                                               M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS                X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          The destination operand was in a non-writable segment.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                 X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC            X             X          An unaligned memory reference was performed while alignment checking was enabled.
RORX Rotate Right Extended

Rotates the bits of the source operand right (toward the least-significant bit) by the number of bit 
positions specified in an immediate operand and writes the result to the destination. Does not affect the 
arithmetic flags. 

This instruction has three operands: 

RORX dest, src, rot_cnt 

On each right-shift, the bit shifted out of the least-significant bit position is copied to the most- 
significant bit. This instruction performs a non-destructive operation; that is, the contents of the source 
operand are unaffected by the operation, unless the destination and source are the same general- 
purpose register. 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand
size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored. 16-bit
operands are not supported.

The destination (dest) is a general-purpose register and the source (src) is either a general-purpose 
register or a memory operand. The rotate count rot_cnt is encoded in an immediate byte. When the 
operand size is 32, bits [7:5] of the immediate byte are ignored; when the operand size is 64, bits [7:6] 
of the immediate byte are ignored. 
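The handling of the immediate byte can be sketched as a pure function. This is an illustrative model only; the name `rorx` and the `operand_bits` parameter (standing in for the operand size selected by VEX.W) are assumptions made for the example.

```python
def rorx(src, imm8, operand_bits=32):
    """Model RORX: return `src` rotated right; no flags are produced."""
    if operand_bits not in (32, 64):
        raise ValueError("16-bit operands are not supported")
    mask = (1 << operand_bits) - 1
    src &= mask
    # 32-bit: bits [7:5] of the immediate are ignored; 64-bit: bits [7:6].
    count = imm8 & (operand_bits - 1)
    return ((src >> count) | (src << (operand_bits - count))) & mask


print(hex(rorx(0x12345678, 8)))       # 0x78123456
print(hex(rorx(0x12345678, 0x28)))    # 0x78123456 (0x28 masks down to a count of 8)
```

Because the result is returned rather than written over the input, the model mirrors the non-destructive form of the instruction.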

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


                                     Encoding
Mnemonic                       VEX   RXB.map_select   W.vvvv.L.pp   Opcode
RORX reg32, reg/mem32, imm8    C4    RXB.03           0.1111.0.11   F0 /r ib
RORX reg64, reg/mem64, imm8    C4    RXB.03           1.1111.0.11   F0 /r ib


Related Instructions 

SARX, SHLX, SHRX 

rFLAGS Affected 

None. 
Exceptions


                          Mode
Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD       X     X                        BMI2 instructions are only recognized in protected mode.
                                              X          BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
                                              X          VEX.L is 1.
Stack, #SS                                    X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                       X          A memory address exceeded a data segment limit or was non-canonical.
                                              X          A null data segment was used to reference memory.
Page fault, #PF                               X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                          X          An unaligned memory reference was performed while alignment checking was enabled.


24594 — Rev. 3.28—September 2019 



AMD64 Technology 


SAHF Store AH into Flags 

Loads the SF, ZF, AF, PF, and CF flags of the EFLAGS register with values from the corresponding 
bits in the AH register (bits 7, 6, 4, 2, and 0, respectively). The instruction ignores bits 1, 3, and 5 of 
register AH; it sets those bits in the EFLAGS register to 1, 0, and 0, respectively.
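The bit mapping described above can be expressed as a mask-and-merge. This is a minimal sketch for illustration; the function name `sahf` and the constant `SAHF_MASK` are hypothetical names introduced here.

```python
SAHF_MASK = 0xD5   # bits 7 (SF), 6 (ZF), 4 (AF), 2 (PF), and 0 (CF)


def sahf(eflags, ah):
    """Return EFLAGS after loading its low byte from AH as SAHF does."""
    # Bit 1 is forced to 1; bits 3 and 5 are forced to 0.
    low_byte = (ah & SAHF_MASK) | 0x02
    return (eflags & ~0xFF) | low_byte


print(hex(sahf(0x0000, 0xFF)))   # 0xd7
print(hex(sahf(0x0202, 0x41)))   # 0x243
```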

The SAHF instruction is available in 64-bit mode if CPUID Fn8000_0001_ECX[LahfSahf] = 1. It is 
always available in the other operating modes (including compatibility mode).

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic    Opcode    Description
SAHF        9E        Loads the sign flag, the zero flag, the auxiliary flag, the parity flag, and the carry flag from the AH register into the lower 8 bits of the EFLAGS register.


Related Instructions 

LAHF 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                                        M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


Exceptions 


Exception                 Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                           X          The SAHF instruction is not supported in 64-bit mode, as indicated by CPUID Fn8000_0001_ECX[LahfSahf] = 0.
SAL Shift Left

SHL 

Shifts the bits of a register or memory location (first operand) to the left through the CF bit by the 
number of bit positions in an unsigned immediate value or the CL register (second operand). The 
instruction discards bits shifted out of the CF flag. For each bit shift, the SAL instruction clears the 
least-significant bit to 0. At the end of the shift operation, the CF flag contains the last bit shifted out of 
the first operand. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. 

The effect of this instruction is multiplication by powers of two. 

For 1-bit shifts, the instruction sets the OF flag to the logical xor of the CF bit (after the shift) and the 
most significant bit of the result. When the shift count is greater than 1, the OF flag is undefined. 

If the shift count is 0, no flags are modified. 
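The shift and flag rules above can be modeled as follows. This is a sketch under stated assumptions: the function name `sal` and the use of `None` for an undefined or unmodified flag are conventions of the example, and the CF value it reports for counts at or above the operand width (zero) is a property of this model rather than something the text above specifies.

```python
def sal(value, count, width):
    """Model SAL/SHL on a `width`-bit operand. Returns (result, cf, of).

    cf and of are None when the masked count is 0 (no flags modified);
    of is also None when the count is greater than 1 (undefined).
    """
    mask = (1 << width) - 1
    value &= mask
    # The count is masked to 5 bits, or to 6 bits for a 64-bit destination.
    count &= 63 if width == 64 else 31
    if count == 0:
        return value, None, None
    wide = value << count
    result = wide & mask                 # vacated low bits are cleared to 0
    cf = (wide >> width) & 1             # last bit shifted out of the operand
    of = (cf ^ (result >> (width - 1))) if count == 1 else None
    return result, cf, of


print(sal(0x81, 1, 8))    # (2, 1, 1)
print(sal(3, 4, 32))      # (48, 0, None): 3 * 2**4
```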

SHL is an alias to the SAL instruction. 


Mnemonic               Opcode      Description
SAL reg/mem8, 1        D0 /4       Shift an 8-bit register or memory location left 1 bit.
SAL reg/mem8, CL       D2 /4       Shift an 8-bit register or memory location left the number of bits specified in the CL register.
SAL reg/mem8, imm8     C0 /4 ib    Shift an 8-bit register or memory location left the number of bits specified by an 8-bit immediate value.
SAL reg/mem16, 1       D1 /4       Shift a 16-bit register or memory location left 1 bit.
SAL reg/mem16, CL      D3 /4       Shift a 16-bit register or memory location left the number of bits specified in the CL register.
SAL reg/mem16, imm8    C1 /4 ib    Shift a 16-bit register or memory location left the number of bits specified by an 8-bit immediate value.
SAL reg/mem32, 1       D1 /4       Shift a 32-bit register or memory location left 1 bit.
SAL reg/mem32, CL      D3 /4       Shift a 32-bit register or memory location left the number of bits specified in the CL register.
SAL reg/mem32, imm8    C1 /4 ib    Shift a 32-bit register or memory location left the number of bits specified by an 8-bit immediate value.
SAL reg/mem64, 1       D1 /4       Shift a 64-bit register or memory location left 1 bit.
SAL reg/mem64, CL      D3 /4       Shift a 64-bit register or memory location left the number of bits specified in the CL register.
SAL reg/mem64, imm8    C1 /4 ib    Shift a 64-bit register or memory location left the number of bits specified by an 8-bit immediate value.
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Mnemonic 

Opcode 

Description 

SHL reg/mem8, 1 

DOM 

Shift an 8-bit register or memory location by 1 bit. 

SHL reg/mem8, CL 

D2M 

Shift an 8-bit register or memory location left the number 
of bits specified in the CL register. 

SHL reg/mem8, imm8 

CO M ib 

Shift an 8-bit register or memory location left the number 
of bits specified by an 8-bit immediate value. 

SHL reg/mem16, 1 

D1 M 

Shift a 16-bit register or memory location left 1 bit. 

SHL reg/mem16, CL 

D3M 

Shift a 16-bit register or memory location left the number 
of bits specified in the CL register. 

SHL reg/mem16, imm8 

Cl M ib 

Shift a 16-bit register or memory location left the number 
of bits specified by an 8-bit immediate value. 

SHL reg/mem32, 1 

D1 M 

Shift a 32-bit register or memory location left 1 bit. 

SHL reg/mem32, CL 

D3M 

Shift a 32-bit register or memory location left the number 
of bits specified in the CL register. 

SHL reg/mem32, imm8 

Cl M ib 

Shift a 32-bit register or memory location left the number 
of bits specified by an 8-bit immediate value. 

SHL reg/mem64, 1 

D1 M 

Shift a 64-bit register or memory location left 1 bit. 

SHL reg/mem64, CL 

D3M 

Shift a 64-bit register or memory location left the number 
of bits specified in the CL register. 

SHL reg/mem64, imm8 

Cl M ib 

Shift a 64-bit register or memory location left the number 
of bits specified by an 8-bit immediate value. 


Related Instructions 

SAR, SHR, SHLD, SHRD 

rFLAGS Affected

Flag    ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
Effect                                                  M                       M     M     U     M     M
Bit     21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X      X              X           A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                         X           The destination operand was in a non-writable segment.
General protection, #GP                         X           A null data segment was used to reference memory.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.

SAR Shift Arithmetic Right

Shifts the bits of a register or memory location (first operand) to the right through the CF bit by the 
number of bit positions in an unsigned immediate value or the CL register (second operand). The 
instruction discards bits shifted out of the CF flag. At the end of the shift operation, the CF flag 
contains the last bit shifted out of the first operand. 

The SAR instruction does not change the sign bit of the target operand. For each bit shift, it copies the 
sign bit to the next bit, preserving the sign of the result. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. 

For 1-bit shifts, the instruction clears the OF flag to 0. When the shift count is greater than 1, the OF 
flag is undefined. 

If the shift count is 0, no flags are modified. 

Although the SAR instruction effectively divides the operand by a power of 2, its behavior differs from
that of the IDIV instruction. For example, shifting -11 (FFFFFFF5h) right by two bits (that is,
dividing -11 by 4) gives a result of FFFFFFFDh, or -3, whereas the IDIV instruction for dividing -11
by 4 gives a result of -2. This is because the IDIV instruction rounds the quotient toward zero, whereas
the SAR instruction rounds the quotient toward negative infinity. So, for positive operands, SAR
behaves like the corresponding IDIV instruction. For negative operands, it gives the same result if and
only if all the shifted-out bits are zeros; otherwise, the result is smaller by 1.
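
The -11 example can be checked with a small Python sketch. The helper names `sar`, `idiv_quotient`, and `to_signed` are hypothetical and exist only for this illustration; count masking and flag updates are deliberately left out.

```python
def to_signed(value, size=32):
    """Interpret a `size`-bit pattern as a two's-complement integer."""
    value &= (1 << size) - 1
    return value - (1 << size) if value >> (size - 1) else value


def sar(value, count, size=32):
    """Arithmetic right shift of a `size`-bit pattern (sign bit replicated)."""
    mask = (1 << size) - 1
    # Python's >> on a negative int rounds toward negative infinity, like SAR.
    return (to_signed(value, size) >> count) & mask


def idiv_quotient(dividend, divisor):
    """Signed quotient rounded toward zero, as produced by IDIV."""
    q = abs(dividend) // abs(divisor)
    return -q if (dividend < 0) != (divisor < 0) else q


shifted = sar(0xFFFFFFF5, 2)
print(hex(shifted), to_signed(shifted))   # SAR result for -11 shifted by 2
print(idiv_quotient(-11, 4))              # IDIV result for -11 / 4
```

The two results agree whenever the shifted-out bits are all zero, for example when -12 is shifted right by two bits.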


Mnemonic                Opcode      Description
SAR reg/mem8, 1         D0 /7       Shift a signed 8-bit register or memory operand right 1 bit.
SAR reg/mem8, CL        D2 /7       Shift a signed 8-bit register or memory operand right the number of bits specified in the CL register.
SAR reg/mem8, imm8      C0 /7 ib    Shift a signed 8-bit register or memory operand right the number of bits specified by an 8-bit immediate value.
SAR reg/mem16, 1        D1 /7       Shift a signed 16-bit register or memory operand right 1 bit.
SAR reg/mem16, CL       D3 /7       Shift a signed 16-bit register or memory operand right the number of bits specified in the CL register.
SAR reg/mem16, imm8     C1 /7 ib    Shift a signed 16-bit register or memory operand right the number of bits specified by an 8-bit immediate value.
SAR reg/mem32, 1        D1 /7       Shift a signed 32-bit register or memory location right 1 bit.
SAR reg/mem32, CL       D3 /7       Shift a signed 32-bit register or memory location right the number of bits specified in the CL register.
SAR reg/mem32, imm8     C1 /7 ib    Shift a signed 32-bit register or memory location right the number of bits specified by an 8-bit immediate value.
SAR reg/mem64, 1        D1 /7       Shift a signed 64-bit register or memory location right 1 bit.
SAR reg/mem64, CL       D3 /7       Shift a signed 64-bit register or memory location right the number of bits specified in the CL register.
SAR reg/mem64, imm8     C1 /7 ib    Shift a signed 64-bit register or memory location right the number of bits specified by an 8-bit immediate value.


Related Instructions 

SAL, SHL, SHR, SHLD, SHRD 

rFLAGS Affected

Flag    ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
Effect                                                  M                       M     M     U     M     M
Bit     21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X      X              X           A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                         X           The destination operand was in a non-writable segment.
General protection, #GP                         X           A null data segment was used to reference memory.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.

SARX Shift Right Arithmetic Extended

Shifts the bits of the first source operand right (toward the least-significant bit) arithmetically by the 
number of bit positions specified in the second source operand and writes the result to the destination. 
Does not affect the arithmetic flags. 

This instruction has three operands: 

SARX dest, src, shft_cnt 

On each right-shift, the most-significant bit (the sign bit) is replicated. This instruction performs a
non-destructive operation; that is, the contents of the source operand are unaffected by the operation,
unless the destination and source are the same general-purpose register.

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand
size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored. 16-bit
operands are not supported.

The destination (dest) is a general-purpose register and the first source (src) is either a general-purpose 
register or a memory operand. The second source operand shft_cnt is a general-purpose register. When 
the operand size is 32, bits [31:5] of shft_cnt are ignored; when the operand size is 64, bits [63:6] of 
shft_cnt are ignored. 
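
As a rough illustration of the operand rules above, the following Python sketch models the value SARX writes to the destination. The function name `sarx` is an assumption for this example; the parameter names follow the operand names used in the text.

```python
def sarx(src, shft_cnt, size=32):
    """Model the SARX result for a 32- or 64-bit operand size."""
    mask = (1 << size) - 1
    # Bits [31:5] (32-bit) or [63:6] (64-bit) of shft_cnt are ignored.
    count = shft_cnt & (size - 1)
    src &= mask
    signed = src - (1 << size) if src >> (size - 1) else src
    return (signed >> count) & mask   # the sign bit is replicated on each shift


print(hex(sarx(0x80000000, 0x21)))                 # count 21h acts as 1
print(hex(sarx(0x8000000000000000, 0x44, size=64)))  # count 44h acts as 4
```

The function returns a new value and leaves its inputs alone, which mirrors the non-destructive behavior described above; no flags are modeled because the instruction does not affect them.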

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Encoding

Mnemonic                        VEX   RXB.mapselect   W.vvvv.L.pp   Opcode
SARX reg32, reg/mem32, reg32    C4    RXB.02          0.src2.0.10   F7 /r
SARX reg64, reg/mem64, reg64    C4    RXB.02          1.src2.0.10   F7 /r


Related Instructions 

RORX, SHLX, SHRX 

rFLAGS Affected 

None. 

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Invalid opcode, #UD       X      X                          BMI2 instructions are only recognized in protected mode.
Invalid opcode, #UD                             X           BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
Invalid opcode, #UD                             X           VEX.L is 1.
Stack, #SS                                      X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                         X           A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                         X           A null data segment was used to reference memory.
Page fault, #PF                                 X           A page fault resulted from the execution of the instruction.
Alignment check, #AC                            X           An unaligned memory reference was performed while alignment checking was enabled.

SBB Subtract with Borrow

Subtracts an immediate value or the value in a register or a memory location (second operand) from a 
register or a memory location (first operand), and stores the result in the first operand location. If the 
carry flag (CF) is 1, the instruction subtracts 1 from the result. Otherwise, it operates like SUB. 

The SBB instruction sign-extends immediate value operands to the length of the first operand size. 

This instruction evaluates the result for both signed and unsigned data types and sets the OF and CF 
flags to indicate a borrow in a signed or unsigned result, respectively. It sets the SF flag to indicate the 
sign of a signed result. 

This instruction is useful for multibyte (multiword) numbers because it takes into account the borrow 
from a previous SUB instruction. 
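
A minimal Python sketch of the multiword use case follows. It assumes numbers stored as lists of 32-bit words, least-significant word first; the helper names `sub_with_borrow` and `sub_multiword` are hypothetical and chosen only for this example.

```python
def sub_with_borrow(a, b, borrow_in, size=32):
    """One SUB/SBB step: returns (result, cf) for a `size`-bit operand."""
    mask = (1 << size) - 1
    diff = (a & mask) - (b & mask) - borrow_in
    return diff & mask, 1 if diff < 0 else 0   # CF = 1 signals a borrow


def sub_multiword(a_words, b_words, size=32):
    """Subtract two little-endian word lists of equal length.

    The first step has no incoming borrow (it acts like SUB); every
    later step consumes the previous CF (it acts like SBB).
    """
    result, borrow = [], 0
    for a, b in zip(a_words, b_words):
        word, borrow = sub_with_borrow(a, b, borrow, size)
        result.append(word)
    return result, borrow


# 1_0000_0000h minus 1, held as two 32-bit words
print(sub_multiword([0x00000000, 0x00000001], [0x00000001, 0x00000000]))
```

A final borrow of 1 means the subtrahend was larger than the minuend when both are read as unsigned multiword values.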

The forms of the SBB instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 


Mnemonic                  Opcode      Description
SBB AL, imm8              1C ib       Subtract an immediate 8-bit value from the AL register with borrow.
SBB AX, imm16             1D iw       Subtract an immediate 16-bit value from the AX register with borrow.
SBB EAX, imm32            1D id       Subtract an immediate 32-bit value from the EAX register with borrow.
SBB RAX, imm32            1D id       Subtract a sign-extended immediate 32-bit value from the RAX register with borrow.
SBB reg/mem8, imm8        80 /3 ib    Subtract an immediate 8-bit value from an 8-bit register or memory location with borrow.
SBB reg/mem16, imm16      81 /3 iw    Subtract an immediate 16-bit value from a 16-bit register or memory location with borrow.
SBB reg/mem32, imm32      81 /3 id    Subtract an immediate 32-bit value from a 32-bit register or memory location with borrow.
SBB reg/mem64, imm32      81 /3 id    Subtract a sign-extended immediate 32-bit value from a 64-bit register or memory location with borrow.
SBB reg/mem16, imm8       83 /3 ib    Subtract a sign-extended 8-bit immediate value from a 16-bit register or memory location with borrow.
SBB reg/mem32, imm8       83 /3 ib    Subtract a sign-extended 8-bit immediate value from a 32-bit register or memory location with borrow.
SBB reg/mem64, imm8       83 /3 ib    Subtract a sign-extended 8-bit immediate value from a 64-bit register or memory location with borrow.
SBB reg/mem8, reg8        18 /r       Subtract the contents of an 8-bit register from an 8-bit register or memory location with borrow.
SBB reg/mem16, reg16      19 /r       Subtract the contents of a 16-bit register from a 16-bit register or memory location with borrow.
SBB reg/mem32, reg32      19 /r       Subtract the contents of a 32-bit register from a 32-bit register or memory location with borrow.
SBB reg/mem64, reg64      19 /r       Subtract the contents of a 64-bit register from a 64-bit register or memory location with borrow.
SBB reg8, reg/mem8        1A /r       Subtract the contents of an 8-bit register or memory location from the contents of an 8-bit register with borrow.
SBB reg16, reg/mem16      1B /r       Subtract the contents of a 16-bit register or memory location from the contents of a 16-bit register with borrow.
SBB reg32, reg/mem32      1B /r       Subtract the contents of a 32-bit register or memory location from the contents of a 32-bit register with borrow.
SBB reg64, reg/mem64      1B /r       Subtract the contents of a 64-bit register or memory location from the contents of a 64-bit register with borrow.

Related Instructions 

SUB, ADD, ADC 


rFLAGS Affected

Flag    ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
Effect                                                  M                       M     M     M     M     M
Bit     21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X      X              X           A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                         X           The destination operand was in a non-writable segment.
General protection, #GP                         X           A null data segment was used to reference memory.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.

SCAS Scan String

SCASB 

SCASW 

SCASD 

SCASQ 

Compares the AL, AX, EAX, or RAX register with the byte, word, doubleword, or quadword pointed 
to by ES:rDI, sets the status flags in the rFLAGS register according to the results, and then increments 
or decrements the rDI register according to the state of the DF flag in the rFLAGS register. 

If the DF flag is 0, the instruction increments the rDI register; otherwise, it decrements it. The
instruction increments or decrements the rDI register by 1, 2, 4, or 8, depending on the size of the
operands.

The forms of the SCASx instruction with an explicit operand address the operand at ES:rDI. The 
explicit operand serves only to specify the size of the values being compared. 

The no-operands forms of the instruction use the ES:rDI registers to point to the value to be compared. 
The mnemonic determines the size of the operands and the specific register containing the other 
comparison value. 

For block comparisons, the SCASx instructions support the REPE or REPZ prefixes (they are 
synonyms) and the REPNE or REPNZ prefixes (they are synonyms). For details about the REP 
prefixes, see “Repeat Prefixes” on page 12. A SCASx instruction can also operate inside a loop 
controlled by the LOOPcc instruction. 
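
The block-comparison behavior can be sketched as follows. This is a simplified Python model of a REPNE SCASB scan, assuming `memory` is a bytes object that stands in for the ES segment and that `rdi` and `rcx` are plain integers; segment limits, address-size effects, and the flags other than ZF are not modeled. The function name `repne_scasb` is hypothetical.

```python
def repne_scasb(memory, rdi, rcx, al, df=0):
    """Scan bytes for `al`; returns (rdi, rcx, zf) after the scan.

    zf is None when rcx was 0 on entry, because no comparison ran.
    """
    step = -1 if df else 1        # DF = 0 increments rDI, DF = 1 decrements it
    zf = None
    while rcx != 0:
        zf = 1 if memory[rdi] == (al & 0xFF) else 0
        rdi += step               # byte operands move rDI by 1
        rcx -= 1
        if zf:                    # REPNE stops once a comparison sets ZF
            break
    return rdi, rcx, zf


data = b"hello\x00xyz"
rdi, rcx, zf = repne_scasb(data, rdi=0, rcx=len(data), al=0)
print(rdi, rcx, zf)               # rDI ends one byte past the match
print(len(data) - rcx - 1)        # length of the string before the zero byte
```

Note that rDI is updated even on the matching iteration, so the matching byte is at rDI minus the step.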


Mnemonic       Opcode   Description
SCAS mem8      AE       Compare the contents of the AL register with the byte at ES:rDI, and then increment or decrement rDI.
SCAS mem16     AF       Compare the contents of the AX register with the word at ES:rDI, and then increment or decrement rDI.
SCAS mem32     AF       Compare the contents of the EAX register with the doubleword at ES:rDI, and then increment or decrement rDI.
SCAS mem64     AF       Compare the contents of the RAX register with the quadword at ES:rDI, and then increment or decrement rDI.
SCASB          AE       Compare the contents of the AL register with the byte at ES:rDI, and then increment or decrement rDI.
SCASW          AF       Compare the contents of the AX register with the word at ES:rDI, and then increment or decrement rDI.
SCASD          AF       Compare the contents of the EAX register with the doubleword at ES:rDI, and then increment or decrement rDI.
SCASQ          AF       Compare the contents of the RAX register with the quadword at ES:rDI, and then increment or decrement rDI.


Related Instructions 

CMP, CMPSx 

rFLAGS Affected

Flag    ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
Effect                                                  M                       M     M     M     M     M
Bit     21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
General protection, #GP                         X           A null ES segment was used to reference memory.
General protection, #GP   X      X              X           A memory address exceeded the ES segment limit or was non-canonical.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.

SETcc Set Byte on Condition

Checks the status flags in the rFLAGS register and, if the flags meet the condition specified in the
mnemonic (cc), sets the value in the specified 8-bit memory location or register to 1. If the flags do not
meet the specified condition, SETcc clears the memory location or register to 0.

Mnemonics with the A (above) and B (below) tags are intended for use when performing unsigned 
integer comparisons; those with G (greater) and L (less) tags are intended for use with signed integer 
comparisons. 

Software typically uses the SETcc instructions to set logical indicators. Like the CMOVcc instructions 
(page 147), the SETcc instructions can replace two instructions—a conditional jump and a move. 
Replacing conditional jumps with conditional sets can help avoid branch-prediction penalties that may 
result from conditional jumps. 

If the logical value “true” (logical one) is represented in a high-level language as an integer with all 
bits set to 1, software can accomplish such representation by first executing the opposite SETcc 
instruction—for example, the opposite of SETZ is SETNZ—and then decrementing the result. 
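
The all-ones technique can be illustrated with a short Python sketch. The names `setcc` and `all_ones_if_zero` are hypothetical helpers for this example; the sketch models an 8-bit destination only.

```python
def setcc(condition_met):
    """Model SETcc: write 1 when the condition holds, otherwise 0."""
    return 1 if condition_met else 0


def all_ones_if_zero(zf):
    """Produce FFh when ZF = 1 (the SETZ condition) and 00h otherwise."""
    byte = setcc(zf == 0)       # opposite condition, SETNZ: 1 when ZF = 0
    return (byte - 1) & 0xFF    # decrement: 1 becomes 00h, 0 wraps to FFh


print(hex(all_ones_if_zero(1)))   # condition "zero" is true
print(hex(all_ones_if_zero(0)))   # condition "zero" is false
```

Using the opposite condition matters: decrementing the output of SETZ itself would give all ones when the condition is false.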

A ModR/M byte is used to identify the operand. The reg field in the ModR/M byte is unused. 


Mnemonic            Opcode      Description
SETO reg/mem8       0F 90 /0    Set byte if overflow (OF = 1).
SETNO reg/mem8      0F 91 /0    Set byte if not overflow (OF = 0).
SETB reg/mem8       0F 92 /0    Set byte if below (CF = 1).
SETC reg/mem8       0F 92 /0    Set byte if carry (CF = 1).
SETNAE reg/mem8     0F 92 /0    Set byte if not above or equal (CF = 1).
SETNB reg/mem8      0F 93 /0    Set byte if not below (CF = 0).
SETNC reg/mem8      0F 93 /0    Set byte if not carry (CF = 0).
SETAE reg/mem8      0F 93 /0    Set byte if above or equal (CF = 0).
SETZ reg/mem8       0F 94 /0    Set byte if zero (ZF = 1).
SETE reg/mem8       0F 94 /0    Set byte if equal (ZF = 1).
SETNZ reg/mem8      0F 95 /0    Set byte if not zero (ZF = 0).
SETNE reg/mem8      0F 95 /0    Set byte if not equal (ZF = 0).
SETBE reg/mem8      0F 96 /0    Set byte if below or equal (CF = 1 or ZF = 1).
SETNA reg/mem8      0F 96 /0    Set byte if not above (CF = 1 or ZF = 1).
SETNBE reg/mem8     0F 97 /0    Set byte if not below or equal (CF = 0 and ZF = 0).
SETA reg/mem8       0F 97 /0    Set byte if above (CF = 0 and ZF = 0).
SETS reg/mem8       0F 98 /0    Set byte if sign (SF = 1).
SETNS reg/mem8      0F 99 /0    Set byte if not sign (SF = 0).
SETP reg/mem8       0F 9A /0    Set byte if parity (PF = 1).
SETPE reg/mem8      0F 9A /0    Set byte if parity even (PF = 1).
SETNP reg/mem8      0F 9B /0    Set byte if not parity (PF = 0).
SETPO reg/mem8      0F 9B /0    Set byte if parity odd (PF = 0).
SETL reg/mem8       0F 9C /0    Set byte if less (SF <> OF).
SETNGE reg/mem8     0F 9C /0    Set byte if not greater or equal (SF <> OF).
SETNL reg/mem8      0F 9D /0    Set byte if not less (SF = OF).
SETGE reg/mem8      0F 9D /0    Set byte if greater or equal (SF = OF).
SETLE reg/mem8      0F 9E /0    Set byte if less or equal (ZF = 1 or SF <> OF).
SETNG reg/mem8      0F 9E /0    Set byte if not greater (ZF = 1 or SF <> OF).
SETNLE reg/mem8     0F 9F /0    Set byte if not less or equal (ZF = 0 and SF = OF).
SETG reg/mem8       0F 9F /0    Set byte if greater (ZF = 0 and SF = OF).


Related Instructions 

None 

rFLAGS Affected 

None 

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X      X              X           A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                         X           The destination operand was in a non-writable segment.
General protection, #GP                         X           A null data segment was used to reference memory.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.

SFENCE Store Fence

Acts as a barrier to force strong memory ordering (serialization) between store instructions preceding 
the SFENCE and store instructions that follow the SFENCE. Stores to differing memory types, or 
within the WC memory type, may become visible out of program order; the SFENCE instruction 
ensures that the system completes all previous stores in such a way that they are globally visible before 
executing subsequent stores. This includes emptying the store buffer and all write-combining buffers. 

The SFENCE instruction is weakly-ordered with respect to load instructions, data and instruction 
prefetches, and the LFENCE instruction. Speculative loads initiated by the processor, or specified 
explicitly using cache-prefetch instructions, can be reordered around an SFENCE. 

In addition to store instructions, SFENCE is strongly ordered with respect to other SFENCE 
instructions, MFENCE instructions, and serializing instructions. Further details on the use of 
MFENCE to order accesses among differing memory types may be found in AMD64 Architecture 
Programmer's Manual Volume 2: System Programming, section 7.4 “Memory Types” on page 172.

The SFENCE instruction is an SSE1 instruction. Support for SSE1 instructions is indicated by CPUID 
Fn0000_0001_EDX[SSE] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic   Opcode      Description
SFENCE     0F AE F8    Force strong ordering of (serialized) store operations.

Related Instructions

LFENCE, MFENCE, MCOMMIT

rFLAGS Affected

None

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Invalid opcode, #UD       X      X              X           The SSE instructions are not supported, as indicated by EDX bit 25 of CPUID function 0000_0001h; and the AMD extensions to MMX are not supported, as indicated by EDX bit 22 of CPUID function 8000_0001h.

SHL Shift Left

This instruction is synonymous with the SAL instruction. For information, see “SAL SHL” on 
page 306. 

SHLD Shift Left Double

Shifts the bits of a register or memory location (first operand) to the left by the number of bit positions 
in an unsigned immediate value or the CL register (third operand), and shifts in a bit pattern (second 
operand) from the right. At the end of the shift operation, the CF flag contains the last bit shifted out of 
the first operand. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. If the masked count is greater than the operand size, 
the result in the destination register is undefined. 

If the shift count is 0, no flags are modified. 

If the count is 1 and the sign of the operand being shifted changes, the instruction sets the OF flag to 1. 
If the count is greater than 1, OF is undefined. 
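
One way to picture the double shift is to treat the first and second operands as a single value of twice the operand size, as in the Python sketch below. The function name `shld` is an assumption for this example; it returns the result and CF only, and it raises an error for the case the text describes as undefined rather than inventing a result.

```python
def shld(dest, src, count, size=32):
    """Model SHLD on a `size`-bit operand; returns (result, cf)."""
    mask = (1 << size) - 1
    count &= 63 if size == 64 else 31          # count masking described above
    if count == 0:
        return dest & mask, None               # no flags are modified
    if count > size:
        raise ValueError("undefined: masked count exceeds the operand size")
    combined = ((dest & mask) << size) | (src & mask)   # dest:src pair
    shifted = combined << count
    result = (shifted >> size) & mask          # src bits enter from the right
    cf = (shifted >> (2 * size)) & 1           # last bit shifted out of dest
    return result, cf


result, cf = shld(0x12345678, 0x9ABCDEF0, 8)
print(hex(result), cf)
```

The second operand is only read in this model, which matches the text: only the first operand and the flags are written.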


Mnemonic                       Opcode         Description
SHLD reg/mem16, reg16, imm8    0F A4 /r ib    Shift bits of a 16-bit destination register or memory operand to the left the number of bits specified in an 8-bit immediate value, while shifting in bits from the second operand.
SHLD reg/mem16, reg16, CL      0F A5 /r       Shift bits of a 16-bit destination register or memory operand to the left the number of bits specified in the CL register, while shifting in bits from the second operand.
SHLD reg/mem32, reg32, imm8    0F A4 /r ib    Shift bits of a 32-bit destination register or memory operand to the left the number of bits specified in an 8-bit immediate value, while shifting in bits from the second operand.
SHLD reg/mem32, reg32, CL      0F A5 /r       Shift bits of a 32-bit destination register or memory operand to the left the number of bits specified in the CL register, while shifting in bits from the second operand.
SHLD reg/mem64, reg64, imm8    0F A4 /r ib    Shift bits of a 64-bit destination register or memory operand to the left the number of bits specified in an 8-bit immediate value, while shifting in bits from the second operand.
SHLD reg/mem64, reg64, CL      0F A5 /r       Shift bits of a 64-bit destination register or memory operand to the left the number of bits specified in the CL register, while shifting in bits from the second operand.

Related Instructions 

SHRD, SAL, SAR, SHR, SHL 

rFLAGS Affected

Flag    ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
Effect                                                  M                       M     M     U     M     M
Bit     21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X      X              X           A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                         X           The destination operand was in a non-writable segment.
General protection, #GP                         X           A null data segment was used to reference memory.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.


24594 — Rev. 3.28—September 2019 


AMPS 

AMD64 Technology 


SHLX Shift Left Logical Extended 

Shifts the bits of the first source operand left (toward the most-significant bit) by the number of bit 
positions specified in the second source operand and writes the result to the destination. Does not 
affect the arithmetic flags. 

This instruction has three operands: 

SHLX dest, src, shft_cnt 

On each left-shift, a zero is shifted into the least-significant bit position. This instruction performs a 
non-destructive operation; that is, the contents of the source operand are unaffected by the operation, 
unless the destination and source are the same general-purpose register. 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand
size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored. 16-bit
operands are not supported.

The destination (dest) is a general-purpose register and the first source (src) is either a general-purpose
register or a memory operand. The second source operand shft_cnt is a general-purpose register. When
the operand size is 32, bits [31:5] of shft_cnt are ignored; when the operand size is 64, bits [63:6] of
shft_cnt are ignored.

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Encoding

Mnemonic                        VEX   RXB.mapselect   W.vvvv.L.pp   Opcode
SHLX reg32, reg/mem32, reg32    C4    RXB.02          0.src2.0.01   F7 /r
SHLX reg64, reg/mem64, reg64    C4    RXB.02          1.src2.0.01   F7 /r
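
To show how the encoding fields fit together, the Python sketch below assembles the bytes for the register-direct form of SHLX. It relies on the general VEX convention, which is not restated on this page, that the R, X, B, and vvvv fields are stored in inverted (one's-complement) form. The function name `encode_shlx_reg` and the use of plain register numbers (0 for rAX, 1 for rCX, 3 for rBX, and so on) are assumptions for this illustration; memory operands are not handled.

```python
def encode_shlx_reg(dest, src, shft_cnt, operand_size=32):
    """Encode SHLX dest, src, shft_cnt with a register source operand.

    Operands are register numbers in the range 0 to 15.
    """
    r = (dest >> 3) & 1            # extends ModRM.reg (dest)
    b = (src >> 3) & 1             # extends ModRM.r/m (src)
    x = 0                          # no index register in register-direct form
    w = 1 if operand_size == 64 else 0
    # Byte 1: inverted R, X, B, then mapselect = 02h
    byte1 = ((r ^ 1) << 7) | ((x ^ 1) << 6) | ((b ^ 1) << 5) | 0x02
    # Byte 2: W, inverted vvvv (shft_cnt), L = 0, pp = 01b
    byte2 = (w << 7) | ((~shft_cnt & 0xF) << 3) | (0 << 2) | 0b01
    # ModRM: mod = 11b (register), reg = dest, r/m = src
    modrm = 0xC0 | ((dest & 7) << 3) | (src & 7)
    return bytes([0xC4, byte1, byte2, 0xF7, modrm])


# SHLX EAX, EBX, ECX (register numbers 0, 3, 1)
print(encode_shlx_reg(0, 3, 1).hex(" "))
```

Only the pp field separates SHLX (01b) from SARX (10b) in the two encoding tables, so the same sketch could be adapted by changing that constant.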


Related Instructions 

RORX, SARX, SHRX 

rFLAGS Affected 

None. 

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Invalid opcode, #UD       X      X                          BMI2 instructions are only recognized in protected mode.
Invalid opcode, #UD                             X           BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
Invalid opcode, #UD                             X           VEX.L is 1.
Stack, #SS                                      X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                         X           A memory address exceeded a data segment limit or was non-canonical.
General protection, #GP                         X           A null data segment was used to reference memory.
Page fault, #PF                                 X           A page fault resulted from the execution of the instruction.
Alignment check, #AC                            X           An unaligned memory reference was performed while alignment checking was enabled.



SHR Shift Right 

Shifts the bits of a register or memory location (first operand) to the right through the CF bit by the 
number of bit positions in an unsigned immediate value or the CL register (second operand). The 
instruction discards bits shifted out of the CF flag. At the end of the shift operation, the CF flag 
contains the last bit shifted out of the first operand. 

For each bit shift, the instruction clears the most-significant bit to 0. 

The effect of this instruction is unsigned division by powers of two. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. 

For 1-bit shifts, the instruction sets the OF flag to the most-significant bit of the original value. If the 
count is greater than 1, the OF flag is undefined. 

If the shift count is 0, no flags are modified. 
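
The rules above can be modeled in Python as a rough sketch. The function name `shr` and its return convention are assumptions for this example; `None` stands for a flag that is unmodified (count of 0) or described above as undefined.

```python
def shr(value, count, size):
    """Model SHR on a `size`-bit operand; returns (result, cf, of)."""
    mask = (1 << size) - 1
    value &= mask
    count &= 63 if size == 64 else 31        # count masking described above
    if count == 0:
        return value, None, None             # no flags are modified
    result = value >> count                  # zeros enter at the top
    cf = (value >> (count - 1)) & 1          # last bit shifted out
    # For 1-bit shifts, OF is the most-significant bit of the original value.
    of = (value >> (size - 1)) & 1 if count == 1 else None
    return result, cf, of


print(shr(200, 3, 8))     # same quotient as the unsigned division 200 // 8
print(shr(0x81, 1, 8))    # 1-bit shift: OF reflects the original top bit
```

For unsigned operands the result always equals integer division by 2 raised to the masked count, which is the equivalence the description refers to.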


Mnemonic               Opcode     Description
SHR reg/mem8, 1        D0 /5      Shift an 8-bit register or memory operand right 1 bit.
SHR reg/mem8, CL       D2 /5      Shift an 8-bit register or memory operand right the number of bits specified in the CL register.
SHR reg/mem8, imm8     C0 /5 ib   Shift an 8-bit register or memory operand right the number of bits specified by an 8-bit immediate value.
SHR reg/mem16, 1       D1 /5      Shift a 16-bit register or memory operand right 1 bit.
SHR reg/mem16, CL      D3 /5      Shift a 16-bit register or memory operand right the number of bits specified in the CL register.
SHR reg/mem16, imm8    C1 /5 ib   Shift a 16-bit register or memory operand right the number of bits specified by an 8-bit immediate value.
SHR reg/mem32, 1       D1 /5      Shift a 32-bit register or memory operand right 1 bit.
SHR reg/mem32, CL      D3 /5      Shift a 32-bit register or memory operand right the number of bits specified in the CL register.
SHR reg/mem32, imm8    C1 /5 ib   Shift a 32-bit register or memory operand right the number of bits specified by an 8-bit immediate value.
SHR reg/mem64, 1       D1 /5      Shift a 64-bit register or memory operand right 1 bit.
SHR reg/mem64, CL      D3 /5      Shift a 64-bit register or memory operand right the number of bits specified in the CL register.
SHR reg/mem64, imm8    C1 /5 ib   Shift a 64-bit register or memory operand right the number of bits specified by an 8-bit immediate value.

Related Instructions

SHL, SAL, SAR, SHLD, SHRD 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     U     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.

SHRD Shift Right Double

Shifts the bits of a register or memory location (first operand) to the right by the number of bit 
positions in an unsigned immediate value or the CL register (third operand), and shifts in a bit pattern 
(second operand) from the left. At the end of the shift operation, the CF flag contains the last bit shifted 
out of the first operand. 

The processor masks the upper three bits of the count operand, thus restricting the count to a number 
between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the 
count, providing a count in the range of 0 to 63. If the masked count is greater than the operand size, 
the result in the destination register is undefined. 

If the shift count is 0, no flags are modified. 

If the count is 1 and the sign of the value being shifted changes, the instruction sets the OF flag to 1. If 
the count is greater than 1, the OF flag is undefined. 
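
As a rough illustration of the double-shift described above, the following C sketch models the 32-bit form. The function name shrd32 is hypothetical, only CF is modeled (OF is omitted), and the code is a reading aid under those assumptions rather than a definition of the instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of SHRD on 32-bit operands: shift dest right by the masked count and
 * fill the vacated upper bits with the low-order bits of src. *cf is updated
 * only when the masked count is nonzero. */
static uint32_t shrd32(uint32_t dest, uint32_t src, uint8_t count, bool *cf)
{
    unsigned n = count & 0x1Fu;        /* 32-bit operand: count is 0..31 */

    if (n == 0)
        return dest;                   /* no flags are modified */

    *cf = (dest >> (n - 1)) & 1u;      /* last bit shifted out of dest */
    /* n is 1..31 here, so (32 - n) is a valid shift amount in C. */
    return (dest >> n) | (src << (32 - n));
}
```

With dest = 0x12345678, src = 0x9ABCDEF0, and a count of 8, the sketch returns 0xF0123456: the low byte of src becomes the high byte of the result. For 16-bit operands a masked count of 17 to 31 exceeds the operand size, which is the undefined case noted above and is not modeled here.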


Mnemonic                      Opcode        Description
SHRD reg/mem16, reg16, imm8   0F AC /r ib   Shift bits of a 16-bit destination register or memory operand to the right the number of bits specified in an 8-bit immediate value, while shifting in bits from the second operand.
SHRD reg/mem16, reg16, CL     0F AD /r      Shift bits of a 16-bit destination register or memory operand to the right the number of bits specified in the CL register, while shifting in bits from the second operand.
SHRD reg/mem32, reg32, imm8   0F AC /r ib   Shift bits of a 32-bit destination register or memory operand to the right the number of bits specified in an 8-bit immediate value, while shifting in bits from the second operand.
SHRD reg/mem32, reg32, CL     0F AD /r      Shift bits of a 32-bit destination register or memory operand to the right the number of bits specified in the CL register, while shifting in bits from the second operand.
SHRD reg/mem64, reg64, imm8   0F AC /r ib   Shift bits of a 64-bit destination register or memory operand to the right the number of bits specified in an 8-bit immediate value, while shifting in bits from the second operand.
SHRD reg/mem64, reg64, CL     0F AD /r      Shift bits of a 64-bit destination register or memory operand to the right the number of bits specified in the CL register, while shifting in bits from the second operand.


Related Instructions 

SHLD, SHR, SHL, SAR, SAL 
rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     U     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.

SHRX Shift Right Logical Extended

Shifts the bits of the first source operand right (toward the least-significant bit) by the number of bit 
positions specified in the second source operand and writes the result to the destination. Does not 
affect the arithmetic flags. 

This instruction has three operands: 

SHRX dest, src, shft_cnt 

On each right-shift, a zero is shifted into the most-significant bit position. This instruction performs a 
non-destructive operation; that is, the contents of the source operand are unaffected by the operation, 
unless the destination and source are the same general-purpose register. 

In 64-bit mode, the operand size is determined by the value of VEX.W. If VEX.W is 1, the operand
size is 64 bits; if VEX.W is 0, the operand size is 32 bits. In 32-bit mode, VEX.W is ignored. 16-bit
operands are not supported.

The destination (dest) is a general-purpose register and the first source (src) is either a general-purpose 
register or a memory operand. The second source operand shft_cnt is a general-purpose register. When 
the operand size is 32, bits [31:5] of shft_cnt are ignored; when the operand size is 64, bits [63:6] of 
shft_cnt are ignored. 
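
The count handling described above amounts to masking the shift count to the operand width. A minimal C sketch follows, assuming hypothetical helper names shrx32 and shrx64; it models the data result only, since SHRX leaves the arithmetic flags untouched.

```c
#include <stdint.h>

/* 32-bit form: bits [31:5] of shft_cnt are ignored. */
static uint32_t shrx32(uint32_t src, uint32_t shft_cnt)
{
    return src >> (shft_cnt & 31u);
}

/* 64-bit form: bits [63:6] of shft_cnt are ignored. */
static uint64_t shrx64(uint64_t src, uint64_t shft_cnt)
{
    return src >> (shft_cnt & 63u);
}
```

A count of 32 in the 32-bit form therefore behaves like a count of 0 and returns the source unchanged, which differs from what a naive "shift everything out" reading might suggest.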

This instruction is a BMI2 instruction. Support for this instruction is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI2] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


                                    Encoding
Mnemonic                       VEX  RXB.map_select  W.vvvv.L.pp  Opcode
SHRX reg32, reg/mem32, reg32   C4   RXB.02          0.src2.0.11  F7 /r
SHRX reg64, reg/mem64, reg64   C4   RXB.02          1.src2.0.11  F7 /r


Related Instructions 

RORX, SARX, SHLX 

rFLAGS Affected 

None. 
Exceptions

Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        BMI2 instructions are only recognized in protected mode.
                                             X          BMI2 instructions are not supported, as indicated by CPUID Fn0000_0007_EBX_x0[BMI2] = 0.
                                             X          VEX.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled.

SLWPCB Store Lightweight Profiling Control Block Address

Flushes Lightweight Profiling (LWP) state to memory and returns the current effective address of the 
Lightweight Profiling Control Block (LWPCB) in the specified register. The LWPCB address returned 
is truncated to 32 bits if the operand size is 32. 

If LWP is not currently enabled, SLWPCB sets the specified register to zero. 

The flush operation stores the internal event counters for active events and the current ring buffer head 
pointer into the LWPCB. If there is an unwritten event record pending, it is written to the event ring 
buffer. 

The LWPCBADDR MSR holds the linear address of the current LWPCB. If the contents of
LWPCBADDR are not zero, the value returned in the specified register is an effective address that is
calculated by subtracting the current DS.Base address from the linear address kept in LWPCBADDR.
Note that if DS has changed between the time LLWPCB was executed and the time SLWPCB is
executed, this might result in an address that is not currently accessible by the application.

SLWPCB generates an invalid opcode exception (#UD) if the machine is not in protected mode or if 
LWP is not available. 

It is possible to execute SLWPCB when the CPL != 3 or when SMM is active, but if the LWPCB 
pointer is not zero, system software must ensure that the LWPCB and the entire ring buffer are 
properly mapped into writable memory in order to avoid a #PF fault. Using SLWPCB in these 
situations is not recommended. 

See the discussion of lightweight profiling in Volume 2, Chapter 13 for more information on the use of 
the LLWPCB, SLWPCB, LWPINS, and LWPVAL instructions. 

The SLWPCB instruction is implemented if LWP is supported on a processor. Support for LWP is 
indicated by CPUID Fn8000_0001_ECX[LWP] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 

Instruction Encoding 

                     Encoding
Mnemonic        XOP  RXB.map_select  W.vvvv.L.pp  Opcode
SLWPCB reg32    8F   RXB.09          0.1111.0.00  12 /1
SLWPCB reg64    8F   RXB.09          1.1111.0.00  12 /1

ModRM.reg augments the opcode and is assigned the value 001b. ModRM.r/m (augmented by
XOP.R) specifies the register in which to put the LWPCB address. ModRM.mod must be 11b.

Related Instructions

LLWPCB, LWPINS, LWPVAL 

rFLAGS Affected 

None 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          The SLWPCB instruction is not supported, as indicated by CPUID Fn8000_0001_ECX[LWP] = 0.
                         X     X                        The system is not in protected mode.
                                             X          LWP is not available, or mod != 11b, or vvvv != 1111b.
Page fault, #PF                              X          A page fault resulted from reading or writing the LWPCB.
                                             X          A page fault resulted from flushing an event to the ring buffer.

STC Set Carry Flag

Sets the carry flag (CF) in the rFLAGS register to one. 

Mnemonic Opcode Description 

STC F9 Set the carry flag (CF) to one. 

Related Instructions

CLC, CMC

rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                                                                1
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

None 
STD Set Direction Flag

Sets the direction flag (DF) in the rFLAGS register to 1. If the DF flag is 0, each iteration of a string
instruction increments the data pointer (index registers rSI or rDI). If the DF flag is 1, the string
instruction decrements the pointer. Use the CLD instruction before a string instruction to make the
data pointer increment.


Mnemonic Opcode Description 

STD FD Set the direction flag (DF) to one. 


Related Instructions 

CLD, INSx, LODSx, MOVSx, OUTSx, SCASx, STOSx, CMPSx 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                      1
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 

None 
STOS Store String

STOSB 

STOSW 

STOSD 

STOSQ 

Copies a byte, word, doubleword, or quadword from the AL, AX, EAX, or RAX registers to the 
memory location pointed to by ES:rDI and increments or decrements the rDI register according to the 
state of the DF flag in the rFLAGS register. 

If the DF flag is 0, the instruction increments the pointer; otherwise, it decrements the pointer. It
increments or decrements the pointer by 1, 2, 4, or 8, depending on the size of the value being copied.

The forms of the STOSx instruction with an explicit operand use the operand only to specify the type 
(size) of the value being copied. 

The no-operands forms specify the type (size) of the value being copied with the mnemonic. 

The STOSx instructions support the REP prefixes. For details about the REP prefixes, see “Repeat 
Prefixes” on page 12. The STOSx instructions can also operate inside a LOOPcc instruction. 
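
To make the pointer update concrete, here is a small C model of a repeated byte store (REP STOSB). It is a sketch under simplifying assumptions: memory is a flat byte array, the index di stands in for ES:rDI, segmentation and faults are ignored, and the name rep_stosb is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Store al into mem[di] rcx times, stepping di up when df is false (DF = 0)
 * and down when df is true (DF = 1). Returns the final value of di, which
 * points one element past the last store in the direction of travel. */
static size_t rep_stosb(uint8_t *mem, size_t di, uint8_t al, size_t rcx, bool df)
{
    while (rcx != 0) {
        mem[di] = al;                  /* store AL at the current pointer */
        di = df ? di - 1 : di + 1;     /* STOSB steps by 1 byte */
        rcx--;
    }
    return di;
}
```

For STOSW, STOSD, and STOSQ the same loop would step by 2, 4, or 8 and store AX, EAX, or RAX instead.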


Mnemonic      Opcode   Description
STOS mem8     AA       Store the contents of the AL register to ES:rDI, and then increment or decrement rDI.
STOS mem16    AB       Store the contents of the AX register to ES:rDI, and then increment or decrement rDI.
STOS mem32    AB       Store the contents of the EAX register to ES:rDI, and then increment or decrement rDI.
STOS mem64    AB       Store the contents of the RAX register to ES:rDI, and then increment or decrement rDI.
STOSB         AA       Store the contents of the AL register to ES:rDI, and then increment or decrement rDI.
STOSW         AB       Store the contents of the AX register to ES:rDI, and then increment or decrement rDI.
STOSD         AB       Store the contents of the EAX register to ES:rDI, and then increment or decrement rDI.
STOSQ         AB       Store the contents of the RAX register to ES:rDI, and then increment or decrement rDI.


Related Instructions 

LODSx, MOVSx 
rFLAGS Affected

None

Exceptions

Exception                Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP  X     X             X          A memory address exceeded the ES segment limit or was non-canonical.
                                             X          The ES segment was a non-writable segment.
                                             X          A null ES segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.

SUB Subtract

Subtracts an immediate value or the value in a register or memory location (second operand) from a 
register or a memory location (first operand) and stores the result in the first operand location. An 
immediate value is sign-extended to the length of the first operand. 

This instruction evaluates the result for both signed and unsigned data types and sets the OF and CF 
flags to indicate a borrow in a signed or unsigned result, respectively. It sets the SF flag to indicate the 
sign of a signed result. 
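
The relationship between the result and the OF, CF, and SF flags can be illustrated with an 8-bit model in C. The names sub8 and sub8_result are hypothetical, and AF and PF are left out to keep the sketch short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record holding the result and a subset of the flags. */
typedef struct {
    uint8_t result;
    bool cf;   /* borrow in the unsigned interpretation */
    bool of;   /* overflow in the signed interpretation */
    bool sf;   /* sign (most-significant bit) of the result */
    bool zf;   /* result is zero */
} sub8_result;

/* Model of SUB on 8-bit operands: computes a - b modulo 256. */
static sub8_result sub8(uint8_t a, uint8_t b)
{
    sub8_result r;
    r.result = (uint8_t)(a - b);
    r.cf = a < b;
    /* Signed overflow: the operands differ in sign and the result's sign
     * differs from the sign of the first operand. */
    r.of = (((a ^ b) & (a ^ r.result)) & 0x80u) != 0;
    r.sf = (r.result & 0x80u) != 0;
    r.zf = r.result == 0;
    return r;
}
```

For instance, 5 - 7 gives 0xFE with CF = 1 (unsigned borrow) and OF = 0, whereas 0x80 - 1 gives 0x7F with CF = 0 and OF = 1, because -128 - 1 does not fit in a signed byte.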

The forms of the SUB instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 


Mnemonic                 Opcode     Description
SUB AL, imm8             2C ib      Subtract an immediate 8-bit value from the AL register and store the result in AL.
SUB AX, imm16            2D iw      Subtract an immediate 16-bit value from the AX register and store the result in AX.
SUB EAX, imm32           2D id      Subtract an immediate 32-bit value from the EAX register and store the result in EAX.
SUB RAX, imm32           2D id      Subtract a sign-extended immediate 32-bit value from the RAX register and store the result in RAX.
SUB reg/mem8, imm8       80 /5 ib   Subtract an immediate 8-bit value from an 8-bit destination register or memory location.
SUB reg/mem16, imm16     81 /5 iw   Subtract an immediate 16-bit value from a 16-bit destination register or memory location.
SUB reg/mem32, imm32     81 /5 id   Subtract an immediate 32-bit value from a 32-bit destination register or memory location.
SUB reg/mem64, imm32     81 /5 id   Subtract a sign-extended immediate 32-bit value from a 64-bit destination register or memory location.
SUB reg/mem16, imm8      83 /5 ib   Subtract a sign-extended immediate 8-bit value from a 16-bit register or memory location.
SUB reg/mem32, imm8      83 /5 ib   Subtract a sign-extended immediate 8-bit value from a 32-bit register or memory location.
SUB reg/mem64, imm8      83 /5 ib   Subtract a sign-extended immediate 8-bit value from a 64-bit register or memory location.
SUB reg/mem8, reg8       28 /r      Subtract the contents of an 8-bit register from an 8-bit destination register or memory location.
SUB reg/mem16, reg16     29 /r      Subtract the contents of a 16-bit register from a 16-bit destination register or memory location.
SUB reg/mem32, reg32     29 /r      Subtract the contents of a 32-bit register from a 32-bit destination register or memory location.
SUB reg/mem64, reg64     29 /r      Subtract the contents of a 64-bit register from a 64-bit destination register or memory location.
SUB reg8, reg/mem8       2A /r      Subtract the contents of an 8-bit register or memory operand from an 8-bit destination register.
SUB reg16, reg/mem16     2B /r      Subtract the contents of a 16-bit register or memory operand from a 16-bit destination register.
SUB reg32, reg/mem32     2B /r      Subtract the contents of a 32-bit register or memory operand from a 32-bit destination register.
SUB reg64, reg/mem64     2B /r      Subtract the contents of a 64-bit register or memory operand from a 64-bit destination register.


Related Instructions 

ADC, ADD, SBB 

rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                M                       M     M     M     M     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.

T1MSKC Inverse Mask From Trailing Ones

Finds the least significant zero bit in the source operand, clears all bits below that bit to 0, sets all other 
bits to 1 (including the found bit) and writes the result to the destination. If the least significant bit of 
the source operand is 0, the destination is written with all ones. 

This instruction has two operands: 

T1MSKC dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands
are not supported.

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The T1MSKC instruction effectively performs a bit-wise logical or of the inverse of the source 
operand and the result of incrementing the source operand by 1 and stores the result to the destination 
register: 

add tmp1, src, 1
not tmp2, src
or dest, tmp1, tmp2

The value of the carry flag of rFLAGS is generated by the add pseudo-instruction and the remaining
arithmetic flags are generated by the or pseudo-instruction. 
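
The pseudo-instruction sequence above translates directly into C. The sketch below models the 32-bit form; the function name t1mskc32 is hypothetical, and only the carry flag is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of T1MSKC on a 32-bit operand: (src + 1) | ~src.
 * *cf receives the carry out of the add pseudo-instruction. */
static uint32_t t1mskc32(uint32_t src, bool *cf)
{
    uint32_t tmp1 = src + 1u;          /* add tmp1, src, 1 (wraps modulo 2^32) */
    uint32_t tmp2 = ~src;              /* not tmp2, src */
    *cf = (src == 0xFFFFFFFFu);        /* the add carries only for all ones */
    return tmp1 | tmp2;                /* or dest, tmp1, tmp2 */
}
```

With src = 0x00000007 the least significant zero bit is bit 3, so the result is 0xFFFFFFF8; with src = 0x00000006 bit 0 is already zero and the result is all ones.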

The T1MSKC instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


                               Encoding
Mnemonic                  XOP  RXB.map_select  W.vvvv.L.pp  Opcode
T1MSKC reg32, reg/mem32   8F   RXB.09          0.dest.0.00  01 /7
T1MSKC reg64, reg/mem64   8F   RXB.09          1.dest.0.00  01 /7


Related Instructions 

ANDN, BEXTR, BLCFILL, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, 
BLSMSK, BSF, BSR, LZCNT, POPCNT, TZMSK, TZCNT 
rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       M     M     U     U     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        TBM instructions are only recognized in protected mode.
                                             X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0.
                                             X          XOP.L is 1.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled.

TEST Test Bits

Performs a bit-wise logical and on the value in a register or memory location (first operand) with an 
immediate value or the value in a register (second operand) and sets the flags in the rFLAGS register 
based on the result. 

This instruction has two operands: 

TEST dest, src 

While the AND instruction changes the contents of the destination and the flag bits, the TEST 
instruction changes only the flag bits. 
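
The following C sketch illustrates how TEST derives flags from the logical AND without writing a result. It models the 32-bit form; test32 and test_flags are hypothetical names, and AF is omitted because it is undefined after TEST.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record for the flags that TEST defines. */
typedef struct {
    bool of;   /* always cleared */
    bool cf;   /* always cleared */
    bool sf;   /* most-significant bit of the AND result */
    bool zf;   /* AND result is zero */
    bool pf;   /* low byte of the AND result has an even number of 1 bits */
} test_flags;

/* Model of TEST on 32-bit operands. Neither operand is modified. */
static test_flags test32(uint32_t a, uint32_t b)
{
    uint32_t t = a & b;                /* temporary result, then discarded */
    uint8_t low = (uint8_t)(t & 0xFFu);
    test_flags f;

    /* Fold the low byte down to a single parity bit. */
    low ^= (uint8_t)(low >> 4);
    low ^= (uint8_t)(low >> 2);
    low ^= (uint8_t)(low >> 1);

    f.of = false;
    f.cf = false;
    f.sf = (t >> 31) & 1u;
    f.zf = (t == 0);
    f.pf = (low & 1u) == 0;
    return f;
}
```

A common use is checking whether any bit of a mask is set: test32(value, mask).zf is true exactly when value and mask share no set bits.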


Mnemonic                 Opcode     Description
TEST AL, imm8            A8 ib      AND an immediate 8-bit value with the contents of the AL register and set rFLAGS to reflect the result.
TEST AX, imm16           A9 iw      AND an immediate 16-bit value with the contents of the AX register and set rFLAGS to reflect the result.
TEST EAX, imm32          A9 id      AND an immediate 32-bit value with the contents of the EAX register and set rFLAGS to reflect the result.
TEST RAX, imm32          A9 id      AND a sign-extended immediate 32-bit value with the contents of the RAX register and set rFLAGS to reflect the result.
TEST reg/mem8, imm8      F6 /0 ib   AND an immediate 8-bit value with the contents of an 8-bit register or memory operand and set rFLAGS to reflect the result.
TEST reg/mem16, imm16    F7 /0 iw   AND an immediate 16-bit value with the contents of a 16-bit register or memory operand and set rFLAGS to reflect the result.
TEST reg/mem32, imm32    F7 /0 id   AND an immediate 32-bit value with the contents of a 32-bit register or memory operand and set rFLAGS to reflect the result.
TEST reg/mem64, imm32    F7 /0 id   AND a sign-extended immediate 32-bit value with the contents of a 64-bit register or memory operand and set rFLAGS to reflect the result.
TEST reg/mem8, reg8      84 /r      AND the contents of an 8-bit register with the contents of an 8-bit register or memory operand and set rFLAGS to reflect the result.
TEST reg/mem16, reg16    85 /r      AND the contents of a 16-bit register with the contents of a 16-bit register or memory operand and set rFLAGS to reflect the result.
TEST reg/mem32, reg32    85 /r      AND the contents of a 32-bit register with the contents of a 32-bit register or memory operand and set rFLAGS to reflect the result.
TEST reg/mem64, reg64    85 /r      AND the contents of a 64-bit register with the contents of a 64-bit register or memory operand and set rFLAGS to reflect the result.

Related Instructions 



AND, CMP 
rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       M     M     U     M     0
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.

TZCNT Count Trailing Zeros

Counts the number of trailing zero bits in the 16-, 32-, or 64-bit general purpose register or memory 
source operand. Counting starts upward from the least significant bit and stops when the lowest bit 
having a value of 1 is encountered or when the most significant bit is encountered. The count is written 
to the destination register. 

If the input operand is zero, CF is set to 1 and the size (in bits) of the input operand is written to the 
destination register. Otherwise, CF is cleared. 

If the least significant bit is a one, the ZF flag is set to 1 and zero is written to the destination register. 
Otherwise, ZF is cleared. 
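
The counting rule and the CF and ZF behavior described above can be modeled in C as follows. This is a sketch for the 32-bit form only; tzcnt32 is a hypothetical name, and the loop is written for clarity rather than speed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of TZCNT on a 32-bit operand. Returns the number of trailing zero
 * bits, or 32 when src is zero. *cf and *zf receive the resulting flags. */
static unsigned tzcnt32(uint32_t src, bool *cf, bool *zf)
{
    unsigned n = 0;

    if (src == 0) {
        *cf = true;                    /* input operand was zero */
        *zf = false;                   /* result (32) is not zero */
        return 32;
    }
    while ((src & 1u) == 0) {          /* count up from the least significant bit */
        src >>= 1;
        n++;
    }
    *cf = false;
    *zf = (n == 0);                    /* set when bit 0 of the source was 1 */
    return n;
}
```

For example, an input of 8 returns 3 with both flags clear, an input of 1 returns 0 with ZF set, and an input of 0 returns 32 with CF set.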

TZCNT is a BMI instruction. Support for BMI instructions is indicated by CPUID 
Fn0000_0007_EBX_x0[BMI] = 1. If the TZCNT instruction is not available, the encoding is treated 
as the BSF instruction. Software must check the CPUID bit once per program or library initialization 
before using the TZCNT instruction or inconsistent behavior may result. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic                  Opcode        Description
TZCNT reg16, reg/mem16    F3 0F BC /r   Count the number of trailing zeros in reg/mem16.
TZCNT reg32, reg/mem32    F3 0F BC /r   Count the number of trailing zeros in reg/mem32.
TZCNT reg64, reg/mem64    F3 0F BC /r   Count the number of trailing zeros in reg/mem64.


Related Instructions 

ANDN, BEXTR, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, BLSMSK, BSF, 
BSR, LZCNT, POPCNT, T1MSKC, TZMSK 


rFLAGS Affected 


ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                U                       U     M     U     U     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.
Exceptions

Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.

TZMSK Mask From Trailing Zeros

Finds the least significant one bit in the source operand, sets all bits below that bit to 1, clears all other 
bits to 0 (including the found bit) and writes the result to the destination. If the least significant bit of 
the source operand is 1, the destination is written with all zeros. 

This instruction has two operands: 

TZMSK dest, src 

In 64-bit mode, the operand size is determined by the value of XOP.W. If XOP.W is 1, the operand size
is 64-bit; if XOP.W is 0, the operand size is 32-bit. In 32-bit mode, XOP.W is ignored. 16-bit operands
are not supported.

The destination (dest) is a general purpose register. 

The source operand (src) is a general purpose register or a memory operand. 

The TZMSK instruction effectively performs a bit-wise logical and of the inverse (bit-wise complement)
of the source operand and the result of subtracting 1 from the source operand, and stores the result to
the destination register:

sub tmp1, src, 1
not tmp2, src
and dest, tmp1, tmp2

The value of the carry flag of rFLAGS is generated by the sub pseudo-instruction and the remaining
arithmetic flags are generated by the and pseudo-instruction. 
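
As with the pseudo-instructions above, the operation reduces to one C expression. The sketch models the 32-bit form; tzmsk32 is a hypothetical name, and only the carry flag is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of TZMSK on a 32-bit operand: (src - 1) & ~src.
 * *cf receives the borrow out of the sub pseudo-instruction. */
static uint32_t tzmsk32(uint32_t src, bool *cf)
{
    uint32_t tmp1 = src - 1u;          /* sub tmp1, src, 1 (wraps modulo 2^32) */
    uint32_t tmp2 = ~src;              /* not tmp2, src */
    *cf = (src == 0u);                 /* the sub borrows only when src is 0 */
    return tmp1 & tmp2;                /* and dest, tmp1, tmp2 */
}
```

With src = 0x00000008 the least significant one bit is bit 3, so the result is 0x00000007; with src = 0x00000001 the result is zero, and with src = 0 the result is all ones.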

The TZMSK instruction is a TBM instruction. Support for this instruction is indicated by CPUID 
Fn8000_0001_ECX[TBM] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


                              Encoding
Mnemonic                 XOP  RXB.map_select  W.vvvv.L.pp  Opcode
TZMSK reg32, reg/mem32   8F   RXB.09          0.dest.0.00  01 /4
TZMSK reg64, reg/mem64   8F   RXB.09          1.dest.0.00  01 /4


Related Instructions 

ANDN, BEXTR, BLCFILL, BLCI, BLCIC, BLCMSK, BLCS, BLSFILL, BLSI, BLSIC, BLSR, 
BLSMSK, BSF, BSR, LZCNT, POPCNT, T1MSKC, TZCNT 
rFLAGS Affected

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF
                                                0                       M     M     U     U     M
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception 
Invalid opcode, #UD      X     X                        TBM instructions are only recognized in protected mode. 
                                             X          TBM instructions are not supported, as indicated by CPUID Fn8000_0001_ECX[TBM] = 0. 
                                             X          XOP.L is 1. 
Stack, #SS                                   X          A memory address exceeded the stack segment limit or was non-canonical. 
General protection, #GP                      X          A memory address exceeded a data segment limit or was non-canonical. 
                                             X          A null data segment was used to reference memory. 
Page fault, #PF                              X          A page fault resulted from the execution of the instruction. 
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled. 



UD0, UD1, UD2 Undefined Operation 

These opcodes generate an invalid opcode exception. Unlike other undefined opcodes that may be 
defined as legal instructions in the future, these opcodes are intended to stay undefined. On some 
AMD64 processor implementations, UD1 may report an invalid opcode exception regardless of 
whether fetching the ModRM byte could trigger a paging or segmentation exception. 


Mnemonic    Opcode       Description 
UD0         0F FF        Raise an invalid opcode exception 
UD1         0F B9 /r     Raise an invalid opcode exception 
UD2         0F 0B        Raise an invalid opcode exception 


Related Instructions 

None 

rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
Invalid opcode, #UD      X     X             X          This instruction is not recognized. 

WRFSBASE Write FS.base 

WRGSBASE Write GS.base 

Writes the base field of the FS or GS segment descriptor with the value contained in the register 
operand. When supported and enabled, these instructions can be executed at any processor privilege 
level. Instructions are only defined in 64-bit mode. The address written to the base field must be in 
canonical form or a #GP fault will occur. 
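The canonical-form requirement can be checked in software before attempting the write. The sketch below assumes a 48-bit virtual address width, which is an assumption about the implementation and not something this page states; `is_canonical` and `va_bits` are illustrative names.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63 down to (va_bits - 1) of addr are all zeros or all ones."""
    addr &= (1 << 64) - 1
    upper = addr >> (va_bits - 1)                 # bits 63 .. va_bits-1
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones
```

Under that assumption, a base such as 0000_8000_0000_0000h is non-canonical, so a WRFSBASE or WRGSBASE with that value would be expected to raise #GP.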

System software must set the FSGSBASE bit (bit 16) of CR4 to enable the WRFSBASE and 
WRGSBASE instructions. 

Support for this instruction is indicated by CPUID Fn0000_0007_EBX_x0[FSGSBASE] = 1. 

For more information on using the CPUID instruction, see the instruction reference page for the 
CPUID instruction on page 160. For a description of all feature flags related to instruction subset 
support, see Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. 


Mnemonic          Opcode         Description 
WRFSBASE reg32    F3 0F AE /2    Copy the contents of the specified 32-bit general-purpose register to the lower 32 bits of FS.base. 
WRFSBASE reg64    F3 0F AE /2    Copy the contents of the specified 64-bit general-purpose register to FS.base. 
WRGSBASE reg32    F3 0F AE /3    Copy the contents of the specified 32-bit general-purpose register to the lower 32 bits of GS.base. 
WRGSBASE reg64    F3 0F AE /3    Copy the contents of the specified 64-bit general-purpose register to GS.base. 


Related Instructions 

RDFSBASE, RDGSBASE 

rFLAGS Affected 

None. 

Exceptions 


Exception  Legacy  Compatibility  64-bit  Cause of Exception 
#UD        X       X                      Instruction is not valid in compatibility or legacy modes. 
                                  X       Instruction not supported as indicated by CPUID Fn0000_0007_EBX_x0[FSGSBASE] = 0 or, if supported, not enabled in CR4. 
#GP                               X       Attempt to write non-canonical address to segment base address. 

XADD Exchange and Add 

Exchanges the contents of a register (second operand) with the contents of a register or memory 
location (first operand), computes the sum of the two values, and stores the result in the first operand 
location. 

The forms of the XADD instruction that write to memory support the LOCK prefix. For details about 
the LOCK prefix, see “Lock Prefix” on page 11. 
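The exchange-and-add behavior can be sketched as a pure function. This is a hedged model only: the name `xadd` and the returned tuple layout are illustrative assumptions, and of the arithmetic flags only the carry out of the addition is modeled.

```python
def xadd(dest: int, src: int, width: int = 32):
    """Model XADD: returns (new_dest, new_src, carry)."""
    mask = (1 << width) - 1
    dest &= mask
    src &= mask
    total = dest + src
    new_dest = total & mask   # the first operand receives the sum
    new_src = dest            # the second operand receives the old first operand
    carry = total >> width    # 1 if the addition carried out of the operand
    return new_dest, new_src, carry
```

With a memory destination and the LOCK prefix, this pattern is commonly used as an atomic fetch-and-add: the register ends up holding the value that was in memory before the addition.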


Mnemonic                   Opcode      Description 
XADD reg/mem8, reg8        0F C0 /r    Exchange the contents of an 8-bit register with the contents of an 8-bit destination register or memory operand and load their sum into the destination. 
XADD reg/mem16, reg16      0F C1 /r    Exchange the contents of a 16-bit register with the contents of a 16-bit destination register or memory operand and load their sum into the destination. 
XADD reg/mem32, reg32      0F C1 /r    Exchange the contents of a 32-bit register with the contents of a 32-bit destination register or memory operand and load their sum into the destination. 
XADD reg/mem64, reg64      0F C1 /r    Exchange the contents of a 64-bit register with the contents of a 64-bit destination register or memory operand and load their sum into the destination. 

Related Instructions 

None 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF 
                                                M                       M     M     M     M     M 
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0 

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical. 
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical. 
                                             X          The destination operand was in a non-writable segment. 
                                             X          A null data segment was used to reference memory. 
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction. 
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled. 

XCHG Exchange 

Exchanges the contents of the two operands. The operands can be two general-purpose registers or a 
register and a memory location. If either operand references memory, the processor locks 
automatically, whether or not the LOCK prefix is used and independently of the value of IOPL. For 
details about the LOCK prefix, see “Lock Prefix” on page 11. 

The x86 architecture commonly uses the XCHG EAX, EAX instruction (opcode 90h) as a one-byte 
NOP. In 64-bit mode, the processor treats opcode 90h as a true NOP only if it would exchange rAX 
with itself. Without this special handling, the instruction would zero-extend the upper 32 bits of RAX, 
and thus it would not be a true no-operation. Opcode 90h can still be used to exchange rAX and r8 if 
the appropriate REX prefix is used. 
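The special handling of opcode 90h can be illustrated with a small model. This is a sketch under stated assumptions: the register dictionary, the function names, and the `rex_w`/`rex_b` parameters are hypothetical, and the 66h operand-size prefix is ignored.

```python
MASK32 = 0xFFFFFFFF

def naive_xchg_eax_eax(rax: int) -> int:
    """What a literal 32-bit XCHG EAX, EAX would do in 64-bit mode: zero-extend EAX into RAX."""
    return rax & MASK32

def execute_opcode_90(regs: dict, rex_w: bool = False, rex_b: bool = False) -> dict:
    """Model opcode 90h in 64-bit mode for the rAX/r8 register pair."""
    out = dict(regs)
    if not rex_b:
        return out                      # rAX with itself: treated as a true NOP
    if rex_w:
        out["rax"], out["r8"] = regs["r8"], regs["rax"]                      # 64-bit exchange
    else:
        out["rax"], out["r8"] = regs["r8"] & MASK32, regs["rax"] & MASK32    # 32-bit exchange, zero-extended
    return out
```

The first function shows why the special case exists: without it, the one-byte NOP would clear the upper half of RAX.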

This special handling does not apply to the two-byte ModRM form of the XCHG instruction. 


Mnemonic                   Opcode    Description 
XCHG AX, reg16             90 +rw    Exchange the contents of the AX register with the contents of a 16-bit register. 
XCHG reg16, AX             90 +rw    Exchange the contents of a 16-bit register with the contents of the AX register. 
XCHG EAX, reg32            90 +rd    Exchange the contents of the EAX register with the contents of a 32-bit register. 
XCHG reg32, EAX            90 +rd    Exchange the contents of a 32-bit register with the contents of the EAX register. 
XCHG RAX, reg64            90 +rq    Exchange the contents of the RAX register with the contents of a 64-bit register. 
XCHG reg64, RAX            90 +rq    Exchange the contents of a 64-bit register with the contents of the RAX register. 
XCHG reg/mem8, reg8        86 /r     Exchange the contents of an 8-bit register with the contents of an 8-bit register or memory operand. 
XCHG reg8, reg/mem8        86 /r     Exchange the contents of an 8-bit register or memory operand with the contents of an 8-bit register. 
XCHG reg/mem16, reg16      87 /r     Exchange the contents of a 16-bit register with the contents of a 16-bit register or memory operand. 
XCHG reg16, reg/mem16      87 /r     Exchange the contents of a 16-bit register or memory operand with the contents of a 16-bit register. 
XCHG reg/mem32, reg32      87 /r     Exchange the contents of a 32-bit register with the contents of a 32-bit register or memory operand. 
XCHG reg32, reg/mem32      87 /r     Exchange the contents of a 32-bit register or memory operand with the contents of a 32-bit register. 
XCHG reg/mem64, reg64      87 /r     Exchange the contents of a 64-bit register with the contents of a 64-bit register or memory operand. 
XCHG reg64, reg/mem64      87 /r     Exchange the contents of a 64-bit register or memory operand with the contents of a 64-bit register. 

Related Instructions 

BSWAP, XADD 

rFLAGS Affected 

None 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical. 
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical. 
                                             X          The source or destination operand was in a non-writable segment. 
                                             X          A null data segment was used to reference memory. 
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction. 
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled. 

XLAT Translate Table Index 

XLATB 

Uses the unsigned integer in the AL register as an offset into a table and copies the contents of the table 
entry at that location to the AL register. 

The instruction uses seg:[rBX] as the base address of the table. The value of seg defaults to the DS 
segment, but may be overridden by a segment prefix. 

This instruction writes AL without changing RAX[63:8]. This instruction ignores operand size. 

The single-operand form of the XLAT instruction uses the operand to document the segment and 
address size attribute, but it uses the base address specified by the rBX register. 

This instruction is often used to translate data from one format (such as ASCII) to another (such as 
EBCDIC). 
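The table lookup described above can be sketched as follows. This is an illustrative model that treats memory as a flat `bytes` object and ignores segmentation; the name `xlat` and its parameters are assumptions made for the example.

```python
def xlat(memory: bytes, rbx: int, rax: int) -> int:
    """Model XLAT: AL = memory[rBX + unsigned AL]; the upper bits of RAX are preserved."""
    al = rax & 0xFF                      # AL is used as an unsigned table index
    entry = memory[rbx + al]             # table entry at base + index
    return (rax & ~0xFF) | entry         # write AL only, leave RAX[63:8] unchanged
```

A translation table for a character-set conversion would be 256 bytes long, one entry per possible value of AL.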

Mnemonic      Opcode    Description 
XLAT mem8     D7        Set AL to the contents of DS:[rBX + unsigned AL]. 
XLATB         D7        Set AL to the contents of DS:[rBX + unsigned AL]. 

Related Instructions 

None 

rFLAGS Affected 

None 

Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception 
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical. 
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical. 
                                             X          A null data segment was used to reference memory. 
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction. 

XOR Logical Exclusive OR 

Performs a bit-wise logical xor operation on both operands and stores the result in the first operand 
location. The first operand can be a register or memory location. The second operand can be an 
immediate value, a register, or a memory location. XOR-ing a register with itself clears the register. 
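As a hedged illustration, the result and the defined flag outputs can be modeled as below. The flag values follow this instruction's rFLAGS table (OF and CF cleared; SF, ZF, and PF modified; AF undefined and therefore not modeled). The usual definitions of SF, ZF, and PF are assumed, and `xor_flags` is an illustrative name.

```python
def xor_flags(a: int, b: int, width: int = 32):
    """Model XOR: returns (result, flags) for the given operand width."""
    mask = (1 << width) - 1
    result = (a ^ b) & mask
    flags = {
        "OF": 0,                                             # always cleared
        "CF": 0,                                             # always cleared
        "SF": result >> (width - 1),                         # sign bit of the result
        "ZF": int(result == 0),                              # set when the result is zero
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),   # even parity of the low byte
    }
    return result, flags
```

XOR-ing a value with itself yields zero with ZF set, which is why the idiom is widely used to clear a register.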

The forms of the XOR instruction that write to memory support the LOCK prefix. For details about the 
LOCK prefix, see “Lock Prefix” on page 11. 

The instruction performs the following operation for each bit: 


X    Y    X xor Y 
0    0    0 
0    1    1 
1    0    1 
1    1    0 


Mnemonic                   Opcode       Description 
XOR AL, imm8               34 ib        xor the contents of AL with an immediate 8-bit operand and store the result in AL. 
XOR AX, imm16              35 iw        xor the contents of AX with an immediate 16-bit operand and store the result in AX. 
XOR EAX, imm32             35 id        xor the contents of EAX with an immediate 32-bit operand and store the result in EAX. 
XOR RAX, imm32             35 id        xor the contents of RAX with a sign-extended immediate 32-bit operand and store the result in RAX. 
XOR reg/mem8, imm8         80 /6 ib     xor the contents of an 8-bit destination register or memory operand with an 8-bit immediate value and store the result in the destination. 
XOR reg/mem16, imm16       81 /6 iw     xor the contents of a 16-bit destination register or memory operand with a 16-bit immediate value and store the result in the destination. 
XOR reg/mem32, imm32       81 /6 id     xor the contents of a 32-bit destination register or memory operand with a 32-bit immediate value and store the result in the destination. 
XOR reg/mem64, imm32       81 /6 id     xor the contents of a 64-bit destination register or memory operand with a sign-extended 32-bit immediate value and store the result in the destination. 
XOR reg/mem16, imm8        83 /6 ib     xor the contents of a 16-bit destination register or memory operand with a sign-extended 8-bit immediate value and store the result in the destination. 
XOR reg/mem32, imm8        83 /6 ib     xor the contents of a 32-bit destination register or memory operand with a sign-extended 8-bit immediate value and store the result in the destination. 
XOR reg/mem64, imm8        83 /6 ib     xor the contents of a 64-bit destination register or memory operand with a sign-extended 8-bit immediate value and store the result in the destination. 
XOR reg/mem8, reg8         30 /r        xor the contents of an 8-bit destination register or memory operand with the contents of an 8-bit register and store the result in the destination. 
XOR reg/mem16, reg16       31 /r        xor the contents of a 16-bit destination register or memory operand with the contents of a 16-bit register and store the result in the destination. 
XOR reg/mem32, reg32       31 /r        xor the contents of a 32-bit destination register or memory operand with the contents of a 32-bit register and store the result in the destination. 
XOR reg/mem64, reg64       31 /r        xor the contents of a 64-bit destination register or memory operand with the contents of a 64-bit register and store the result in the destination. 
XOR reg8, reg/mem8         32 /r        xor the contents of an 8-bit destination register with the contents of an 8-bit register or memory operand and store the results in the destination. 
XOR reg16, reg/mem16       33 /r        xor the contents of a 16-bit destination register with the contents of a 16-bit register or memory operand and store the results in the destination. 
XOR reg32, reg/mem32       33 /r        xor the contents of a 32-bit destination register with the contents of a 32-bit register or memory operand and store the results in the destination. 
XOR reg64, reg/mem64       33 /r        xor the contents of a 64-bit destination register with the contents of a 64-bit register or memory operand and store the results in the destination. 

Related Instructions 

OR, AND, NOT, NEG 


rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF 
                                                0                       M     M     U     M     0 
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0 

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. 
Undefined flags are U. 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical. 
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical. 
                                             X          The destination operand was in a non-writable segment. 
                                             X          A null data segment was used to reference memory. 
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction. 
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled. 

4 System Instruction Reference 


This chapter describes the function, mnemonic syntax, opcodes, affected flags, and possible 
exceptions generated by the system instructions. System instructions are used to establish the 
processor operating mode, access processor resources, handle program and system errors, manage 
memory, and instantiate a virtual machine. Most of these instructions can only be executed by 
privileged software, such as the operating system or a Virtual Machine Monitor (VMM), also known 
as a hypervisor. Only system instructions can access certain processor resources, such as the control 
registers, model-specific registers, and debug registers. 

Most system instructions are supported in all hardware implementations of the AMD64 architecture. 
The table below lists instructions that may not be supported on a given processor implementation. 
System software must execute the CPUID instruction using the function number listed to determine 
support prior to using these instructions. 


Table 4-1. System Instruction Support Indicated by CPUID Feature Bits 


Instruction                             CPUID Feature Bit                           Register[Bit] 
Long Mode and Long Mode instructions    CPUID Fn8000_0001_EDX[LM]                   EDX[29] 
MONITOR, MWAIT                          CPUID Fn0000_0001_ECX[MONITOR]              ECX[3] 
MONITORX, MWAITX                        CPUID Fn8000_0001_ECX[MONITORX]             ECX[29] 
RDMSR, WRMSR                            CPUID Fn0000_0001_EDX[MSR]                  EDX[5] 
RDTSCP                                  CPUID Fn8000_0001_EDX[RDTSCP]               EDX[27] 
SKINIT, STGI                            CPUID Fn8000_0001_ECX[SKINIT]               ECX[12] 
SVM Architecture and instructions       CPUID Fn8000_0001_ECX[SVM]                  ECX[2] 
SYSCALL, SYSRET                         CPUID Fn8000_0001_EDX[SysCallSysRet]        EDX[11] 
SYSENTER, SYSEXIT                       CPUID Fn0000_0001_EDX[SysEnterSysExit]      EDX[11] 
WBNOINVD                                CPUID Fn8000_0008_EBX[WBNOINVD]             EBX[9] 


There are also several other CPUID feature bits that indicate support for certain paging functions, 
virtual-mode extensions, machine-check exceptions, advanced programmable interrupt control 
(APIC), memory-type range registers (MTRRs), etc. 

For more information on using the CPUID instruction, see the reference page for the CPUID 
instruction on page 160. For a comprehensive list of all instruction support feature flags, see 
Appendix D, “Instruction Subsets and CPUID Feature Flags,” on page 537. For a comprehensive list 
of all defined CPUID feature numbers and return values, see Appendix E, “Obtaining Processor 
Information Via the CPUID Instruction,” on page 607. 
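A feature check against Table 4-1 amounts to testing one bit of one CPUID output register. The sketch below is illustrative only: Python cannot execute CPUID directly, so `cpuid_regs` is a hypothetical caller-supplied mapping from function number to register values, and `FEATURE_BITS` and `has_feature` are names invented for the example. The bit positions are taken from the table.

```python
# (CPUID function, output register, bit position), per Table 4-1
FEATURE_BITS = {
    "LM":              (0x80000001, "edx", 29),
    "MONITOR":         (0x00000001, "ecx", 3),
    "MONITORX":        (0x80000001, "ecx", 29),
    "MSR":             (0x00000001, "edx", 5),
    "RDTSCP":          (0x80000001, "edx", 27),
    "SKINIT":          (0x80000001, "ecx", 12),
    "SVM":             (0x80000001, "ecx", 2),
    "SysCallSysRet":   (0x80000001, "edx", 11),
    "SysEnterSysExit": (0x00000001, "edx", 11),
    "WBNOINVD":        (0x80000008, "ebx", 9),
}

def has_feature(name: str, cpuid_regs: dict) -> bool:
    """Return True if the named feature bit is set in the supplied CPUID results."""
    function, register, bit = FEATURE_BITS[name]
    return bool((cpuid_regs[function][register] >> bit) & 1)
```

System software would perform the equivalent test on the real CPUID output before using any of the listed instructions.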

For further information about the system instructions and register resources, see: 

• “System-Management Instructions” in Volume 2. 

• “Summary of Registers and Data Types” on page 38. 

• “Notation” on page 52. 

• “Instruction Prefixes” on page 5. 

ARPL Adjust Requestor Privilege Level 

Compares the requestor privilege level (RPL) fields of two segment selectors in the source and 
destination operands of the instruction. If the RPL field of the destination operand is less than the RPL 
field of the segment selector in the source register, then the zero flag is set and the RPL field of the 
destination operand is increased to match that of the source operand. Otherwise, the destination 
operand remains unchanged and the zero flag is cleared. 

The destination operand can be either a 16-bit register or memory location; the source operand must be 
a 16-bit register. 
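The comparison and adjustment can be sketched as a pure function on two 16-bit selectors. The RPL occupies the two low-order bits of a segment selector; the function name `arpl` and the returned `(selector, zf)` pair are illustrative assumptions.

```python
def arpl(dest_selector: int, src_selector: int):
    """Model ARPL: returns (new_dest_selector, ZF)."""
    dest_selector &= 0xFFFF
    dest_rpl = dest_selector & 0x3        # RPL is selector bits 1:0
    src_rpl = src_selector & 0x3
    if dest_rpl < src_rpl:
        # Raise the destination RPL to match the source and set ZF.
        return (dest_selector & 0xFFFC) | src_rpl, 1
    # Destination unchanged, ZF cleared.
    return dest_selector, 0
```

For example, a destination selector of 0008h (RPL 0) adjusted against a source selector of 001Bh (RPL 3) becomes 000Bh.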

The ARPL instruction is intended for use by operating-system procedures to adjust the RPL of a 
segment selector that has been passed to the operating system by an application program to match the 
privilege level of the application program. The segment selector passed to the operating system is 
placed in the destination operand and the segment selector for the code segment of the application 
program is placed in the source operand. The RPL field in the source operand represents the privilege 
level of the application program. The ARPL instruction then ensures that the RPL of the segment 
selector received by the operating system is numerically no lower (that is, no more privileged) than 
the privilege level of the application program. 

See “Adjusting Access Rights” in Volume 2, for more information on access rights. 

In 64-bit mode, this opcode (63H) is used for the MOVSXD instruction. 


Mnemonic                   Opcode    Description 
ARPL reg/mem16, reg16      63 /r     Adjust the RPL of a destination segment selector to a level not less than the RPL of the segment selector specified in the 16-bit source register. (Invalid in 64-bit mode.) 


Related Instructions 

LAR, LSL, VERR, VERW 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF 
                                                                              M 
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0 

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags 
are blank. Undefined flags are U. 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
Invalid opcode, #UD      X     X                        This instruction is only recognized in protected legacy and compatibility mode. 
Stack, #SS                                   X          A memory address exceeded the stack segment limit. 
General protection, #GP                      X          A memory address exceeded a data segment limit. 
                                             X          The destination operand was in a non-writable segment. 
                                             X          A null segment selector was used to reference memory. 
Page fault, #PF                              X          A page fault resulted from the execution of the instruction. 
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled. 

CLAC Clear Alignment Check Flag 

Sets the Alignment Check flag in the rFLAGS register to zero. Support for the CLAC instruction is 
indicated by CPUID Fn07_EBX[20] = 1. For more information on using the CPUID instruction, see 
the description of the CPUID instruction on page 160. 


Mnemonic    Opcode      Description 
CLAC        0F 01 CA    Clear AC Flag 

Related Instructions 

STAC 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF 
                  0 
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0 

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are 
blank. Undefined flags are U. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
Invalid opcode, #UD      X     X             X          Instruction not supported by CPUID 
                               X             X          Instruction is not supported in virtual mode 
                         X                   X          Lock prefix (F0h) preceding opcode. 
                                             X          CPL was not 0 

CLGI Clear Global Interrupt Flag 

Clears the global interrupt flag (GIF). While GIF is zero, all external interrupts are disabled. 

This is a Secure Virtual Machine instruction. Support for the SVM architecture and the SVM 
instructions is indicated by CPUID Fn8000_0001_ECX[SVM] = 1. For more information on using the 
CPUID instruction, see the reference page for the CPUID instruction on page 160. 

This instruction generates a #UD exception if SVM is not enabled. See “Enabling SVM” in AMD64 
Architecture Programmer's Manual Volume 2: System Programming, order# 24593. 


Mnemonic    Opcode      Description 
CLGI        0F 01 DD    Clears the global interrupt flag (GIF). 

Related Instructions 

STGI 

rFLAGS Affected 

None. 

Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception 
Invalid opcode, #UD      X     X             X          The SVM instructions are not supported as indicated by CPUID Fn8000_0001_ECX[SVM] = 0. 
                                             X          Secure Virtual Machine was not enabled (EFER.SVME=0). 
                         X     X                        Instruction is only recognized in protected mode. 
General protection, #GP                      X          CPL was not zero. 

CLI Clear Interrupt Flag 

Clears the interrupt flag (IF) in the rFLAGS register to zero, thereby masking external interrupts 
received on the INTR input. Interrupts received on the non-maskable interrupt (NMI) input are not 
affected by this instruction. 

In real mode, this instruction clears IF to 0. 

In protected mode and virtual-8086-mode, this instruction is IOPL-sensitive. If the CPL is less than or 
equal to the rFLAGS.IOPL field, the instruction clears IF to 0. 

In protected mode, if IOPL < 3, CPL = 3, and protected mode virtual interrupts are enabled (CR4.PVI 
= 1), then the instruction instead clears rFLAGS.VIF to 0. If none of these conditions apply, the 
processor raises a general-protection exception (#GP). For more information, see “Protected Mode 
Virtual Interrupts” in Volume 2. 

In virtual-8086 mode, if IOPL < 3 and the virtual-8086-mode extensions are enabled (CR4.VME = 1), 
the CLI instruction clears the virtual interrupt flag (rFLAGS.VIF) to 0 instead. 

See “Virtual-8086 Mode Extensions” in Volume 2 for more information about IOPL-sensitive 
instructions. 
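The IOPL-sensitive behavior described above can be sketched as a small decision function. This is a hedged model, not the processor's definition: the flag dictionary, the `mode` strings, and the `GeneralProtectionFault` class are illustrative assumptions.

```python
class GeneralProtectionFault(Exception):
    """Stand-in for the #GP(0) exception."""

def cli(flags: dict, cpl: int, iopl: int, mode: str,
        vme: bool = False, pvi: bool = False) -> dict:
    """Model CLI. mode is 'real', 'virtual' (virtual-8086), or 'protected'."""
    out = dict(flags)
    if mode == "real" or cpl <= iopl:
        out["IF"] = 0                      # sufficient privilege: clear the real interrupt flag
    elif (mode == "virtual" and vme) or (mode == "protected" and pvi and cpl == 3):
        out["VIF"] = 0                     # virtual-interrupt extensions: clear VIF instead
    else:
        raise GeneralProtectionFault("#GP(0)")
    return out
```

In virtual-8086 mode the CPL is always 3, so the VIF path is reached only when IOPL is less than 3.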


Mnemonic Opcode Description 

CLI FA Clear the interrupt flag (IF) to zero. 

Action 

IF (CPL <= IOPL) 
    RFLAGS.IF = 0 
ELSEIF (((VIRTUAL_MODE) && (CR4.VME == 1)) 
        || ((PROTECTED_MODE) && (CR4.PVI == 1) && (CPL == 3))) 
    RFLAGS.VIF = 0 
ELSE 
    EXCEPTION[#GP(0)] 

Related Instructions 

STI 

rFLAGS Affected 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF 
            M                                               M 
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0 

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags are 
blank. Undefined flags are U. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
General protection, #GP        X                        The CPL was greater than the IOPL and virtual mode extensions are not enabled (CR4.VME = 0). 
                                             X          The CPL was greater than the IOPL and either the CPL was not 3 or protected mode virtual interrupts were not enabled (CR4.PVI = 0). 

CLTS Clear Task-Switched Flag in CR0 

Clears the task-switched (TS) flag in the CR0 register to 0. The processor sets the TS flag on each task 
switch. The CLTS instruction is intended to facilitate the synchronization of FPU context saves during 
multitasking operations. 

This instruction can only be used if the current privilege level is 0. 

See “System-Control Registers” in Volume 2 for more information on FPU synchronization and the 
TS flag. 

Mnemonic    Opcode    Description 
CLTS        0F 06     Clear the task-switched (TS) flag in CR0 to 0. 

Related Instructions 

LMSW, MOV CRn 

rFLAGS Affected 

None 

Exceptions 

Exception                Real  Virtual 8086  Protected  Cause of Exception 
General protection, #GP        X             X          CPL was not 0. 

HLT Halt 

Causes the microprocessor to halt instruction execution and enter the HALT state. Entering the HALT 
state puts the processor in low-power mode. Execution resumes when an unmasked hardware interrupt 
(INTR), non-maskable interrupt (NMI), system management interrupt (SMI), RESET, or INIT occurs. 

If an INTR, NMI, or SMI is used to resume execution after a HLT instruction, the saved instruction 
pointer points to the instruction following the HLT instruction. 

Before executing a HLT instruction, hardware interrupts should be enabled. If rFLAGS.IF = 0, the 
system will remain in a HALT state until an NMI, SMI, RESET, or INIT occurs. 

If an SMI brings the processor out of the HALT state, the SMI handler can decide whether to return to 
the HALT state or not. See Volume 2: System Programming, for information on SMIs. 

Current privilege level must be 0 to execute this instruction. 


Mnemonic Opcode Description 

HLT F4 Halt instruction execution. 

Related Instructions 

STI, CLI 

rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception 
General protection, #GP        X             X          CPL was not 0. 

INT 3 Interrupt to Debug Vector 

Calls the debug exception handler. This instruction maps to a 1-byte opcode (CC) that raises a #BP 
exception. The INT 3 instruction is normally used by debug software to set instruction breakpoints by 
replacing the first byte of the instruction opcode bytes with the INT 3 opcode. 
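The breakpoint technique described above can be sketched on an in-memory copy of code bytes. This is an illustration only: a real debugger would write into the target process through an operating-system interface, and the helper names `set_breakpoint` and `clear_breakpoint` are hypothetical.

```python
INT3_OPCODE = 0xCC  # one-byte INT 3 opcode

def set_breakpoint(code: bytearray, offset: int) -> int:
    """Overwrite the first byte of the instruction at offset with INT 3; return the saved byte."""
    original = code[offset]
    code[offset] = INT3_OPCODE
    return original

def clear_breakpoint(code: bytearray, offset: int, original: int) -> None:
    """Restore the byte saved by set_breakpoint so the instruction can execute normally."""
    code[offset] = original
```

Because the opcode is a single byte, it can replace the first byte of any instruction without overwriting the following instruction.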

This one-byte INT 3 instruction behaves differently from the two-byte INT 3 instruction (opcode CD 
03) (see “INT” in Chapter 3 “General Purpose Instructions” for further information) in two ways: 

• The #BP exception is handled without any IOPL checking in virtual x86 mode. (IOPL mismatches 
will not trigger an exception.) 

• In VME mode, the #BP exception is not redirected via the interrupt redirection table. (Instead, it is 
handled by a protected mode handler.) 

Mnemonic Opcode Description 

INT 3 CC Trap to debugger at Interrupt 3. 

For complete descriptions of the steps performed by INT instructions, see the following: 

• Legacy-Mode Interrupts: “Legacy Protected-Mode Interrupt Control Transfers” in Volume 2. 

• Long-Mode Interrupts: “Long-Mode Interrupt Control Transfers” in Volume 2. 

Action 

// Refer to INT instruction's Action section for the details on INT_N_REAL, 
// INT_N_PROTECTED, and INT_N_VIRTUAL_TO_PROTECTED. 

INT3_START: 

IF (REAL_MODE) 
    INT_N_REAL // N = 3 
ELSEIF (PROTECTED_MODE) 
    INT_N_PROTECTED // N = 3 
ELSE // VIRTUAL_MODE 
    INT_N_VIRTUAL_TO_PROTECTED // N = 3 

Related Instructions 

INT, INTO, IRET 



rFLAGS Affected 

If a task switch occurs, all flags are modified; otherwise, settings are as follows: 

ID    VIP   VIF   AC    VM    RF    NT    IOPL  OF    DF    IF    TF    SF    ZF    AF    PF    CF 
                  M     0     0     M                       M     0 
21    20    19    18    17    16    14    13:12 11    10    9     8     7     6     4     2     0 

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags 
are blank. Undefined flags are U. 


Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception 
Breakpoint, #BP                      X     X             X          INT 3 instruction was executed. 
Invalid TSS, #TS (selector)                X             X          As part of a stack switch, the target stack segment selector or rSP in the TSS was beyond the TSS limit. 
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS was beyond the limit of the GDT or LDT descriptor table. 
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS was a null selector. 
                                           X             X          As part of a stack switch, the target stack segment selector’s TI bit was set, but the LDT selector was a null selector. 
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS contained a RPL that was not equal to its DPL. 
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS contained a DPL that was not equal to the CPL of the code segment selector. 
                                           X             X          As part of a stack switch, the target stack segment selector in the TSS was not a writable segment. 
Segment not present, #NP (selector)        X             X          The accessed code segment, interrupt gate, trap gate, task gate, or TSS was not present. 
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical. 
Stack, #SS (selector)                      X             X          After a stack switch, a memory address exceeded the stack segment limit or was non-canonical. 
                                           X             X          As part of a stack switch, the SS register was loaded with a non-null segment selector and the segment was marked not present. 
General protection, #GP              X     X             X          A memory address exceeded the data segment limit or was non-canonical. 
                                     X     X             X          The target offset exceeded the code segment limit or was non-canonical. 
General protection, #GP (selector)   X     X             X          The interrupt vector was beyond the limit of IDT. 
                                           X             X          The descriptor in the IDT was not an interrupt, trap, or task gate in legacy mode or not a 64-bit interrupt or trap gate in long mode. 
                                           X             X          The DPL of the interrupt, trap, or task gate descriptor was less than the CPL. 
                                           X             X          The segment selector specified by the interrupt or trap gate had its TI bit set, but the LDT selector was a null selector. 
                                           X             X          The segment descriptor specified by the interrupt or trap gate exceeded the descriptor table limit or was a null selector. 
                                           X             X          The segment descriptor specified by the interrupt or trap gate was not a code segment in legacy mode, or not a 64-bit code segment in long mode. 
                                                         X          The DPL of the segment specified by the interrupt or trap gate was greater than the CPL. 
                                           X                        The DPL of the segment specified by the interrupt or trap gate was not 0 or it was a conforming segment. 
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction. 
Alignment check, #AC                       X             X          An unaligned memory reference was performed while alignment checking was enabled. 

INVD Invalidate Caches 

Invalidates all levels of cache associated with this processor. This may or may not include lower level 
caches associated with another processor that shares any level of this processor's cache hierarchy. 

No data is written back to main memory from invalidating the caches. 

CPUID Fn8000_001D_EDX[WBINVD]_xN indicates the behavior of the processor at various levels
of the cache hierarchy. If the feature bit is 0, the instruction causes the invalidation of all lower level
caches of other processors sharing the designated level of cache. If the feature bit is 1, the instruction
does not necessarily cause the invalidation of all lower level caches of other processors sharing the
designated level of cache. See Appendix E, “Obtaining Processor Information Via the CPUID
Instruction,” on page 607 for more information on using the CPUID function.

This is a privileged instruction. The current privilege level (CPL) of a procedure invalidating the 
processor’s internal caches must be 0. 

To ensure that data is written back to memory prior to invalidating caches, use the WBINVD
instruction. 

This instruction does not invalidate TLB caches. 

INVD is a serializing instruction. 


Mnemonic  Opcode  Description

INVD      0F 08   Invalidate internal caches and trigger external cache
                  invalidations.


Related Instructions 

WBINVD, WBNOINVD, CLWB, CLFLUSH 

rFLAGS Affected 

None 

Exceptions 


Exception                  Real  Virtual 8086  Protected  Cause of Exception

General protection, #GP               X            X      CPL was not 0.

INVLPG Invalidate TLB Entry

Invalidates the TLB entry that would be used for the 1-byte memory operand. 

This instruction invalidates the TLB entry, regardless of the G (Global) bit setting in the associated 
PDE or PTE entry and regardless of the page size (4 Kbytes, 2 Mbytes, 4 Mbytes, or 1 Gbyte). It may 
invalidate any number of additional TLB entries, in addition to the targeted entry. 

INVLPG is a serializing instruction and a privileged instruction. The current privilege level must be 0 
to execute this instruction. 

See “Page Translation and Protection” in Volume 2 for more information on page translation. 
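
As a rough illustration of "the TLB entry for the page containing a specified memory location", the sketch below computes the base of the page that holds a given linear address for each page size named above. The names `PAGE_SIZES`, `page_base`, and `same_page` are hypothetical helpers introduced here, not part of the architecture; the code only models the address arithmetic, not the instruction itself.

```python
# Hypothetical helper names; page sizes are the ones listed in the text.
PAGE_SIZES = {"4K": 4 << 10, "2M": 2 << 20, "4M": 4 << 20, "1G": 1 << 30}

def page_base(linear_address: int, page_size: str) -> int:
    """Return the base address of the page containing linear_address."""
    size = PAGE_SIZES[page_size]
    return linear_address & ~(size - 1)

def same_page(a: int, b: int, page_size: str) -> bool:
    """True if both addresses fall in the same page of the given size."""
    return page_base(a, page_size) == page_base(b, page_size)
```

Under this model, any byte address inside a page names the same translation, which is why a 1-byte operand is enough to identify the targeted entry.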

Mnemonic     Opcode    Description

INVLPG mem8  0F 01 /7  Invalidate the TLB entry for the page containing a specified
                       memory location.

Related Instructions

INVLPGA, MOV CRn (CR3 and CR4)

rFLAGS Affected

None

Exceptions

Exception                  Real  Virtual 8086  Protected  Cause of Exception

General protection, #GP               X            X      CPL was not 0.

INVLPGA Invalidate TLB Entry in a Specified ASID

Invalidates the TLB mapping for a given virtual page and a given ASID. The virtual (linear) address is 
specified in the implicit register operand rAX. The portion of RAX used to form the address is 
determined by the effective address size (current execution mode and optional address size prefix). 
The ASID is taken from ECX. 
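
The operand selection described above can be sketched as follows. This is a model only: the function name `invlpga_operands` is hypothetical, and it assumes that "the portion of RAX used" means the low-order bits matching the effective address size.

```python
def invlpga_operands(rax: int, rcx: int, address_size: int) -> tuple:
    """Model which bits INVLPGA reads from its implicit operands.

    Assumption: the virtual address is the low address_size bits of RAX;
    the ASID is the 32-bit ECX, i.e. the low 32 bits of RCX.
    """
    if address_size not in (16, 32, 64):
        raise ValueError("effective address size must be 16, 32, or 64")
    virtual_address = rax & ((1 << address_size) - 1)
    asid = rcx & 0xFFFF_FFFF
    return virtual_address, asid
```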

The INVLPGA instruction may invalidate any number of additional TLB entries, in addition to the 
targeted entry. 

The INVLPGA instruction is a serializing instruction and a privileged instruction. The current 
privilege level must be 0 to execute this instruction. 

This is a Secure Virtual Machine (SVM) instruction. Support for the SVM architecture and the SVM 
instructions is indicated by CPUID Fn8000_0001_ECX[SVM] = 1. For more information on using the 
CPUID instruction, see the reference page for the CPUID instruction on page 160. 

This instruction generates a #UD exception if SVM is not enabled. See “Enabling SVM” in AMD64
Architecture Programmer's Manual Volume 2: System Programming, order# 24593.

Mnemonic          Opcode    Description

INVLPGA rAX, ECX  0F 01 DF  Invalidates the TLB mapping for the virtual page
                            specified in rAX and the ASID specified in ECX.

Related Instructions

INVLPG.

rFLAGS Affected

None.

Exceptions

Exception                  Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD         X         X            X      The SVM instructions are not supported as indicated by
                                                          CPUID Fn8000_0001_ECX[SVM] = 0.
                                                   X      Secure Virtual Machine was not enabled (EFER.SVME=0).
                            X         X                   Instruction is only recognized in protected mode.
General protection, #GP                            X      CPL was not zero.

IRET Return from Interrupt

IRETD 

IRETQ 

Returns program control from an exception or interrupt handler to a program or procedure previously 
interrupted by an exception, an external interrupt, or a software-generated interrupt. These instructions 
also perform a return from a nested task. All flags, CS, and rIP are restored to the values they had 
before the interrupt so that execution may continue at the next instruction following the interrupt or 
exception. In 64-bit mode or if the CPL changes, SS and RSP are also restored. 

IRET, IRETD, and IRETQ are synonyms mapping to the same opcode. They are intended to provide 
semantically distinct forms for various opcode sizes. The IRET instruction is used for 16-bit operand 
size; IRETD is used for 32-bit operand sizes; IRETQ is used for 64-bit operands. The latter form is 
only meaningful in 64-bit mode. 

IRET, IRETD, or IRETQ must be used to terminate the exception or interrupt handler associated with 
the exception, external interrupt, or software-generated interrupt. 

IRETx is a serializing instruction. 

For detailed descriptions of the steps performed by IRETx instructions, see the following:

• Legacy-Mode Interrupts: “Legacy Protected-Mode Interrupt Control Transfers” in Volume 2. 

• Long-Mode Interrupts: “Long-Mode Interrupt Control Transfers” in Volume 2. 


Mnemonic  Opcode  Description

IRET      CF      Return from interrupt (16-bit operand size)
IRETD     CF      Return from interrupt (32-bit operand size)
IRETQ     CF      Return from interrupt (64-bit operand size)

Action 




IRET_START:

    IF (REAL_MODE)
        IRET_REAL
    ELSIF (PROTECTED_MODE)
        IRET_PROTECTED
    ELSE // (VIRTUAL_MODE)
        IRET_VIRTUAL

IRET_REAL:

    POP.v temp_RIP
    POP.v temp_CS
    POP.v temp_RFLAGS

    IF (temp_RIP > CS.limit)
        EXCEPTION [#GP(0)]

    CS.sel = temp_CS
    CS.base = temp_CS SHL 4

    RFLAGS.v = temp_RFLAGS // VIF,VIP,VM unchanged
    RIP = temp_RIP
    EXIT

IRET_PROTECTED:

    IF (RFLAGS.NT==1)           // iret does a task-switch to a previous task
        IF (LEGACY_MODE)
            TASK_SWITCH         // using the 'back link' field in the tss
        ELSE                    // (LONG_MODE)
            EXCEPTION [#GP(0)]  // task switches aren't supported in long mode

    POP.v temp_RIP
    POP.v temp_CS
    POP.v temp_RFLAGS

    IF ((temp_RFLAGS.VM==1) && (CPL==0) && (LEGACY_MODE))
        IRET_FROM_PROTECTED_TO_VIRTUAL

    temp_CPL = temp_CS.rpl

    IF ((64BIT_MODE) || (temp_CPL!=CPL))
    {
        POP.v temp_RSP  // in 64-bit mode, iret always pops ss:rsp
        POP.v temp_SS
    }

    CS = READ_DESCRIPTOR (temp_CS, iret_chk)

    IF ((64BIT_MODE) && (temp_RIP is non-canonical)
        || (!64BIT_MODE) && (temp_RIP > CS.limit))
    {
        EXCEPTION [#GP(0)]
    }

    CPL = temp_CPL

    IF ((started in 64-bit mode) || (changing CPL))
                        // ss:rsp were popped, so load them into the registers
    {
        SS = READ_DESCRIPTOR (temp_SS, ss_chk)
        RSP.s = temp_RSP
    }

    IF (changing CPL)
    {
        FOR (seg = ES, DS, FS, GS)
            IF ((seg.attr.dpl < CPL) && ((seg.attr.type == 'data')
                || (seg.attr.type == 'non-conforming-code')))
            {
                seg = NULL  // can't use lower dpl data segment at higher cpl
            }
    }

    RFLAGS.v = temp_RFLAGS  // VIF,VIP,IOPL only changed if (old_CPL==0)
                            // IF only changed if (old_CPL<=old_RFLAGS.IOPL)
                            // VM unchanged
                            // RF cleared
    RIP = temp_RIP
    EXIT


IRET_VIRTUAL:

    IF ((RFLAGS.IOPL<3) && (CR4.VME==0))
        EXCEPTION [#GP(0)]

    POP.v temp_RIP
    POP.v temp_CS
    POP.v temp_RFLAGS

    IF (temp_RIP > CS.limit)
        EXCEPTION [#GP(0)]

    IF (RFLAGS.IOPL==3)
    {
        RFLAGS.v = temp_RFLAGS  // VIF,VIP,VM,IOPL unchanged
                                // RF cleared
        CS.sel = temp_CS
        CS.base = temp_CS SHL 4

        RIP = temp_RIP
        EXIT
    }
    // now ((IOPL<3) && (CR4.VME==1))
    ELSIF ((OPERAND_SIZE==16)
           && !((temp_RFLAGS.IF==1) && (RFLAGS.VIP==1))
           && (temp_RFLAGS.TF==0))
    {
        RFLAGS.w = temp_RFLAGS  // RFLAGS.VIF=temp_RFLAGS.IF
                                // IF,IOPL unchanged
                                // RF cleared
        CS.sel = temp_CS
        CS.base = temp_CS SHL 4

        RIP = temp_RIP
        EXIT
    }
    ELSE // ((RFLAGS.IOPL<3) && (CR4.VME==1) && ((OPERAND_SIZE==32) ||
         //  ((temp_RFLAGS.IF==1) && (RFLAGS.VIP==1)) || (temp_RFLAGS.TF==1)))
        EXCEPTION [#GP(0)]


IRET_FROM_PROTECTED_TO_VIRTUAL:

    // temp_RIP already popped
    // temp_CS already popped
    // temp_RFLAGS already popped, temp_RFLAGS.VM=1

    POP.d temp_RSP
    POP.d temp_SS
    POP.d temp_ES
    POP.d temp_DS
    POP.d temp_FS
    POP.d temp_GS

    CS.sel = temp_CS  // force the segments to have virtual-mode values
    CS.base = temp_CS SHL 4
    CS.limit= 0x0000FFFF
    CS.attr = 16-bit dpl3 code

    SS.sel = temp_SS
    SS.base = temp_SS SHL 4
    SS.limit= 0x0000FFFF
    SS.attr = 16-bit dpl3 stack

    DS.sel = temp_DS
    DS.base = temp_DS SHL 4
    DS.limit= 0x0000FFFF
    DS.attr = 16-bit dpl3 data

    ES.sel = temp_ES
    ES.base = temp_ES SHL 4
    ES.limit= 0x0000FFFF
    ES.attr = 16-bit dpl3 data

    FS.sel = temp_FS
    FS.base = temp_FS SHL 4
    FS.limit= 0x0000FFFF
    FS.attr = 16-bit dpl3 data

    GS.sel = temp_GS
    GS.base = temp_GS SHL 4
    GS.limit= 0x0000FFFF
    GS.attr = 16-bit dpl3 data

    RSP.d = temp_RSP
    RFLAGS.d = temp_RFLAGS
    CPL = 3

    RIP = temp_RIP AND 0x0000FFFF
    EXIT
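
The branch structure of IRET_VIRTUAL is easy to misread, so the following sketch restates only its privilege-related #GP(0) decision as a predicate. The function name `iret_v86_raises_gp` and its parameters are hypothetical; the code limit check on temp_RIP is deliberately left out, and the sketch is a reading of the pseudocode above rather than a definitive model.

```python
def iret_v86_raises_gp(iopl: int, vme: bool, operand_size: int,
                       new_if: bool, old_vip: bool, new_tf: bool) -> bool:
    """Return True if IRET in virtual-8086 mode would take #GP(0).

    new_if / new_tf are the IF and TF bits of the popped flags image;
    old_vip is RFLAGS.VIP before the instruction. The separate
    temp_RIP > CS.limit check is not modeled here.
    """
    if iopl == 3:
        return False          # IOPL==3: the return proceeds
    if not vme:
        return True           # IOPL<3 and CR4.VME==0
    # IOPL<3 and CR4.VME==1: only a 16-bit IRET with benign flags proceeds
    return bool(operand_size == 32 or (new_if and old_vip) or new_tf)
```

Each `True` case corresponds to one of the bulleted conditions listed under General protection, #GP in the exceptions table for this instruction.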

Related Instructions 

INT, INTO, INT3 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
M    M    M    M    M    M    M    M      M    M    M    M    M    M    M    M    M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags 
are blank. Undefined flags are U. 


Exceptions 


Exception                  Real  Virtual 8086  Protected  Cause of Exception

Segment not present, #NP                           X      The return code segment was marked not present.
(selector)
Stack, #SS                  X         X            X      A memory address exceeded the stack segment limit or was
                                                          non-canonical.
Stack, #SS                                         X      The SS register was loaded with a non-null segment selector
(selector)                                                and the segment was marked not present.
General protection, #GP     X         X            X      The target offset exceeded the code segment limit or was
                                                          non-canonical.
                                      X                   IOPL was less than 3 and one of the following conditions
                                                          was true:
                                                          • CR4.VME was 0.
                                                          • The effective operand size was 32-bit.
                                                          • Both the original EFLAGS.VIP and the new EFLAGS.IF
                                                            were set.
                                                          • The new EFLAGS.TF was set.
                                                   X      IRETx was executed in long mode while EFLAGS.NT=1.
General protection, #GP                            X      The return code selector was a null selector.
(selector)
                                                   X      The return stack selector was a null selector and the return
                                                          mode was non-64-bit mode or CPL was 3.
                                                   X      The return code or stack descriptor exceeded the descriptor
                                                          table limit.
                                                   X      The return code or stack selector’s TI bit was set but the
                                                          LDT selector was a null selector.
                                                   X      The segment descriptor for the return code was not a code
                                                          segment.
                                                   X      The RPL of the return code segment selector was less than
                                                          the CPL.
                                                   X      The return code segment was non-conforming and the
                                                          segment selector’s DPL was not equal to the RPL of the code
                                                          segment’s segment selector.
                                                   X      The return code segment was conforming and the segment
                                                          selector’s DPL was greater than the RPL of the code
                                                          segment’s segment selector.
                                                   X      The segment descriptor for the return stack was not a
                                                          writable data segment.
                                                   X      The stack segment descriptor DPL was not equal to the RPL
                                                          of the return code segment selector.
                                                   X      The stack segment selector RPL was not equal to the RPL of
                                                          the return code segment selector.
Page fault, #PF                       X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                  X            X      An unaligned memory reference was performed while
                                                          alignment checking was enabled.

LAR Load Access Rights Byte

Loads the access rights from the segment descriptor specified by a 16-bit source register or memory
operand into a specified 16-bit, 32-bit, or 64-bit general-purpose register and sets the zero (ZF) flag in
the rFLAGS register if successful. LAR clears the zero flag if the descriptor is invalid for any reason.

The LAR instruction checks that: 

• the segment selector is not a null selector. 

• the descriptor is within the GDT or LDT limit. 

• the descriptor DPL is greater than or equal to both the CPL and RPL, or the segment is a
conforming code segment.

• the descriptor type is valid for the LAR instruction. Valid descriptor types are shown in the
following table. LDT and TSS descriptors in 64-bit mode, and call-gate descriptors in long mode,
are only valid if bits 12:8 of doubleword +12 are zero.

See Volume 2, Section 6.4 for more information on checking access rights using LAR.
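
The checks listed above can be read as a single predicate that decides whether ZF ends up set. The sketch below is one such reading; `lar_checks_pass` and its parameters are hypothetical names, the caller is assumed to have already compared the descriptor offset with the GDT or LDT limit, and the "bits 12:8 of doubleword +12" rule for 16-byte descriptors is not modeled.

```python
# System descriptor types accepted by LAR, per the table that follows.
LAR_SYSTEM_TYPES_LEGACY = {0x1, 0x2, 0x3, 0x4, 0x5, 0x9, 0xB, 0xC}
LAR_SYSTEM_TYPES_LONG = {0x2, 0x9, 0xB, 0xC}

def lar_checks_pass(selector: int, dpl: int, cpl: int, *,
                    is_system: bool = False, type_field: int = 0,
                    conforming_code: bool = False, long_mode: bool = False,
                    within_table_limit: bool = True) -> bool:
    """Return True if LAR would set ZF for this selector/descriptor pair."""
    if (selector & 0xFFFC) == 0:          # null selector: index 0, TI = 0
        return False
    if not within_table_limit:            # descriptor beyond GDT/LDT limit
        return False
    rpl = selector & 0x3
    if not conforming_code and dpl < max(cpl, rpl):
        return False                      # DPL must be >= both CPL and RPL
    if is_system:
        valid = LAR_SYSTEM_TYPES_LONG if long_mode else LAR_SYSTEM_TYPES_LEGACY
        return type_field in valid
    return True                           # all code and data descriptors
```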


  Valid Descriptor Type
Legacy Mode    Long Mode    Description

All            All          All code and data descriptors
1              —            Available 16-bit TSS
2              2            LDT
3              —            Busy 16-bit TSS
4              —            16-bit call gate
5              —            Task gate
9              9            Available 32-bit or 64-bit TSS
B              B            Busy 32-bit or 64-bit TSS
C              C            32-bit or 64-bit call gate


If the segment descriptor passes these checks, the attributes are loaded into the destination general- 
purpose register. If it does not, then the zero flag is cleared and the destination register is not modified. 

When the operand size is 16 bits, access rights include the DPL and Type fields located in bytes 4 and
5 of the descriptor table entry. Before loading the access rights into the destination operand, the low
order word is masked with FF00H.

When the operand size is 32 or 64 bits, access rights include the DPL and type as well as the descriptor
type (S field), segment present (P flag), available to system (AVL flag), default operation size (D/B
flag), and granularity flags located in bytes 4-7 of the descriptor. Before being loaded into the
destination operand, the doubleword is masked with 00FF_FF00H.

In 64-bit mode, for both 32-bit and 64-bit operand sizes, 32-bit register results are zero-extended to 64 
bits. 
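
To make the two masks concrete, the sketch below applies them to the upper doubleword (bytes 4-7) of a descriptor. The helper name `lar_value` is hypothetical, and the example descriptor value is an illustrative flat 32-bit code segment, not something taken from this manual.

```python
def lar_value(descriptor_high_dword: int, operand_size: int) -> int:
    """Return the value LAR would place in the destination register.

    descriptor_high_dword is bytes 4-7 of the descriptor as a
    little-endian 32-bit integer. With a 16-bit operand size only the
    low word is used, masked with FF00h; otherwise the doubleword is
    masked with 00FF_FF00h (and zero-extended in 64-bit mode).
    """
    if operand_size == 16:
        return descriptor_high_dword & 0xFF00
    return descriptor_high_dword & 0x00FF_FF00

# Illustrative descriptor: present, DPL 0, code, readable, G=1, D=1.
attrs = lar_value(0x00CF9A00, 32)
dpl = (attrs >> 13) & 0x3       # bits 14:13 of the result
present = (attrs >> 15) & 0x1   # bit 15 of the result
```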

This instruction can only be executed in protected mode. 


Mnemonic              Opcode    Description

LAR reg16, reg/mem16  0F 02 /r  Reads the GDT/LDT descriptor referenced by the 16-bit
                                source operand, masks the attributes with FF00h and saves
                                the result in the 16-bit destination register.
LAR reg32, reg/mem16  0F 02 /r  Reads the GDT/LDT descriptor referenced by the 16-bit
                                source operand, masks the attributes with 00FFFF00h and
                                saves the result in the 32-bit destination register.
LAR reg64, reg/mem16  0F 02 /r  Reads the GDT/LDT descriptor referenced by the 16-bit
                                source operand, masks the attributes with 00FFFF00h and
                                saves the result in the 64-bit destination register.

Related Instructions 

ARPL, LSL, VERR, VERW 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                                                   M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or zero is M (modified). Unaffected flags are blank. 
Undefined flags are U. 


Exceptions 


Exception                  Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD         X         X                   This instruction is only recognized in protected mode.
Stack, #SS                                         X      A memory address exceeded the stack segment limit or was
                                                          non-canonical.
General protection, #GP                            X      A memory address exceeded the data segment limit or was
                                                          non-canonical.
                                                   X      A null data segment was used to reference memory.
Page fault, #PF                                    X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                               X      An unaligned memory reference was performed while
                                                          alignment checking was enabled.

LGDT Load Global Descriptor Table Register

Loads the pseudo-descriptor specified by the source operand into the global descriptor table register 
(GDTR). The pseudo-descriptor is a memory location containing the GDTR base and limit. In legacy 
and compatibility mode, the pseudo-descriptor is 6 bytes; in 64-bit mode, it is 10 bytes. 

If the operand size is 16 bits, the high-order byte of the 6-byte pseudo-descriptor is not used. The lower 
two bytes specify the 16-bit limit and the third, fourth, and fifth bytes specify the 24-bit base address. 
The high-order byte of the GDTR is filled with zeros. 

If the operand size is 32 bits, the lower two bytes specify the 16-bit limit and the upper four bytes 
specify a 32-bit base address. 

In 64-bit mode, the lower two bytes specify the 16-bit limit and the upper eight bytes specify a 64-bit
base address. In 64-bit mode, operand-size prefixes are ignored and the operand size is forced to
64 bits; therefore, the pseudo-descriptor is always 10 bytes.
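
The three pseudo-descriptor layouts described above can be sketched as a small decoder. `parse_pseudo_descriptor` is a hypothetical helper that only interprets bytes in memory (little-endian, as on all x86 processors); it does not load GDTR, and the same layout applies to the IDTR pseudo-descriptor used by LIDT.

```python
import struct

def parse_pseudo_descriptor(raw: bytes, operand_size: int) -> tuple:
    """Decode a GDTR/IDTR pseudo-descriptor into (limit, base).

    operand_size 16 or 32 reads 6 bytes; 64 (64-bit mode) reads 10.
    With a 16-bit operand size only a 24-bit base is used and the
    high-order byte of the base is zero.
    """
    if operand_size == 64:
        limit, base = struct.unpack_from("<HQ", raw)   # 2 + 8 bytes
    elif operand_size in (16, 32):
        limit, base = struct.unpack_from("<HI", raw)   # 2 + 4 bytes
        if operand_size == 16:
            base &= 0x00FF_FFFF
    else:
        raise ValueError("operand size must be 16, 32, or 64")
    return limit, base
```

For example, a 6-byte image holding limit 0027h and base 1234_5678h decodes to base 34_5678h under a 16-bit operand size, because the sixth byte is ignored.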

This instruction is only used in operating system software and must be executed at CPL 0. It is 
typically executed once in real mode to initialize the processor before switching to protected mode. 

LGDT is a serializing instruction. 

Mnemonic       Opcode    Description

LGDT mem16:32  0F 01 /2  Loads mem16:32 into the global descriptor table register.
LGDT mem16:64  0F 01 /2  Loads mem16:64 into the global descriptor table register.

Related Instructions

LIDT, LLDT, LTR, SGDT, SIDT, SLDT, STR

rFLAGS Affected

None

Exceptions

Exception                  Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD         X         X            X      The operand was a register.
Stack, #SS                  X                      X      A memory address exceeded the stack segment limit or was
                                                          non-canonical.
General protection, #GP     X                      X      A memory address exceeded the data segment limit or was
                                                          non-canonical.
                                      X            X      CPL was not 0.
                                                   X      The new GDT base address was non-canonical.
                                                   X      A null data segment was used to reference memory.
Page fault, #PF                                    X      A page fault resulted from the execution of the instruction.

LIDT Load Interrupt Descriptor Table Register

Loads the pseudo-descriptor specified by the source operand into the interrupt descriptor table register 
(IDTR). The pseudo-descriptor is a memory location containing the IDTR base and limit. In legacy 
and compatibility mode, the pseudo-descriptor is six bytes; in 64-bit mode, it is 10 bytes. 

If the operand size is 16 bits, the high-order byte of the 6-byte pseudo-descriptor is not used. The lower 
two bytes specify the 16-bit limit and the third, fourth, and fifth bytes specify the 24-bit base address. 
The high-order byte of the IDTR is filled with zeros. 

If the operand size is 32 bits, the lower two bytes specify the 16-bit limit and the upper four bytes 
specify a 32-bit base address. 

In 64-bit mode, the lower two bytes specify the 16-bit limit, and the upper eight bytes specify a 64-bit
base address. In 64-bit mode, operand-size prefixes are ignored and the operand size is forced to
64 bits; therefore, the pseudo-descriptor is always 10 bytes.

This instruction is only used in operating system software and must be executed at CPL 0. It is 
normally executed once in real mode to initialize the processor before switching to protected mode. 

LIDT is a serializing instruction. 

Mnemonic       Opcode    Description

LIDT mem16:32  0F 01 /3  Loads mem16:32 into the interrupt descriptor table register.
LIDT mem16:64  0F 01 /3  Loads mem16:64 into the interrupt descriptor table register.

Related Instructions

LGDT, LLDT, LTR, SGDT, SIDT, SLDT, STR

rFLAGS Affected

None

Exceptions

Exception                  Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD         X         X            X      The operand was a register.
Stack, #SS                  X                      X      A memory address exceeded the stack segment limit or was
                                                          non-canonical.
General protection, #GP     X                      X      A memory address exceeded the data segment limit or was
                                                          non-canonical.
                                      X            X      CPL was not 0.
                                                   X      The new IDT base address was non-canonical.
                                                   X      A null data segment was used to reference memory.
Page fault, #PF                                    X      A page fault resulted from the execution of the instruction.

LLDT Load Local Descriptor Table Register

Loads the specified segment selector into the visible portion of the local descriptor table register
(LDTR). The processor uses the selector to locate the descriptor for the LDT in the global descriptor
table. It then loads this descriptor into the hidden portion of the LDTR.

If the source operand is a null selector, the LDTR is marked invalid and all references to descriptors in 
the LDT will generate a general protection exception (#GP), except for the LAR, VERR, VERW or 
LSL instructions. 

In legacy and compatibility modes, the LDT descriptor is 8 bytes long and contains a 32-bit base 
address. 

In 64-bit mode, the LDT descriptor is 16-bytes long and contains a 64-bit base address. The LDT 
descriptor type (02h) is redefined in 64-bit mode for use as the 16-byte LDT descriptor. 

This instruction must be executed in protected mode. It is only provided for use by operating system 
software at CPL 0. 
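
The selector-related rules for LLDT can be sketched as below. `lldt_selector_fault` is a hypothetical name, and the sketch assumes the conventional selector layout (index in bits 15:3, TI in bit 2, RPL in bits 1:0) and that a descriptor is "beyond the GDT limit" when its last byte lies past the limit. Only the selector and limit checks are modeled, not descriptor type, presence, or base checks.

```python
from typing import Optional

def lldt_selector_fault(selector: int, gdt_limit: int,
                        mode_64bit: bool = False) -> Optional[str]:
    """Return '#GP(selector)' if the selector itself is unusable, else None."""
    if (selector & 0xFFFC) == 0:
        return None                     # null selector: LDTR marked invalid
    if (selector >> 2) & 1:
        return "#GP(selector)"          # TI=1: does not point into the GDT
    descriptor_size = 16 if mode_64bit else 8
    offset = (selector >> 3) * 8        # GDT slots are 8 bytes wide
    if offset + descriptor_size - 1 > gdt_limit:
        return "#GP(selector)"          # descriptor beyond the GDT limit
    return None
```

Note how the same selector can pass in legacy mode and fail in 64-bit mode, because the 16-byte LDT descriptor occupies two consecutive 8-byte GDT slots.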

LLDT is a serializing instruction. 


Mnemonic        Opcode    Description

LLDT reg/mem16  0F 00 /2  Load the 16-bit segment selector into the local descriptor
                          table register and load the LDT descriptor from the GDT.

Related Instructions 

LGDT, LIDT, LTR, SGDT, SIDT, SLDT, STR 

rFLAGS Affected 

None 


Exceptions 


Exception                  Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD         X         X                   This instruction is only recognized in protected mode.
Segment not present, #NP                           X      The LDT descriptor was marked not present.
(selector)
Stack, #SS                                         X      A memory address exceeded the stack segment limit or was
                                                          non-canonical.
General protection, #GP                            X      A memory address exceeded a data segment limit or was
                                                          non-canonical.
                                                   X      CPL was not 0.
                                                   X      A null data segment was used to reference memory.
General protection, #GP                            X      The source selector did not point into the GDT.
(selector)
                                                   X      The descriptor was beyond the GDT limit.
                                                   X      The descriptor was not an LDT descriptor.
                                                   X      The descriptor's extended attribute bits were not zero in
                                                          64-bit mode.
                                                   X      The new LDT base address was non-canonical.
Page fault, #PF                                    X      A page fault resulted from the execution of the instruction.

LMSW Load Machine Status Word

Loads the lower four bits of the 16-bit register or memory operand into bits 3:0 of the machine status
word in register CR0. Only the protection enabled (PE), monitor coprocessor (MP), emulation (EM),
and task switched (TS) bits of CR0 are modified. Additionally, LMSW can set CR0.PE, but cannot
clear it.

The LMSW instruction can be used only when the current privilege level is 0. It is only provided for
compatibility with early processors.

Use the MOV CR0 instruction to load all 32 or 64 bits of CR0.
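
The update rule, including the one-way behavior of PE, can be modeled in a few lines. The function name `lmsw` below is simply a stand-in for the instruction's effect on a CR0 value held in an integer; privilege and memory checks are not modeled.

```python
def lmsw(cr0: int, source: int) -> int:
    """Return the CR0 value after LMSW with the given 16-bit source.

    Only bits 3:0 (PE, MP, EM, TS) change. PE (bit 0) may be set by the
    source but is never cleared: an already-set PE stays set.
    """
    new_low = source & 0xF
    new_low |= cr0 & 0x1          # preserve PE if it was already 1
    return (cr0 & ~0xF) | new_low
```

For instance, with CR0 = 8000_0011h a source of 0000h leaves CR0 unchanged at 8000_0011h, because PE was the only low bit set and cannot be cleared this way.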


Mnemonic        Opcode    Description

LMSW reg/mem16  0F 01 /6  Load the lower 4 bits of the source into the lower 4 bits of
                          CR0.

Related Instructions

MOV CRn, SMSW

rFLAGS Affected

None

Exceptions

Exception                  Real  Virtual 8086  Protected  Cause of Exception

Stack, #SS                  X                      X      A memory address exceeded the stack segment limit or was
                                                          non-canonical.
General protection, #GP     X                      X      A memory address exceeded a data segment limit or was
                                                          non-canonical.
                                      X            X      CPL was not 0.
                                                   X      A null data segment was used to reference memory.
Page fault, #PF                                    X      A page fault resulted from the execution of the instruction.

LSL Load Segment Limit

Loads the segment limit from the segment descriptor specified by a 16-bit source register or memory 
operand into a specified 16-bit, 32-bit, or 64-bit general-purpose register and sets the zero (ZF) flag in 
the rFLAGS register if successful. LSL clears the zero flag if the descriptor is invalid for any reason. 

In 64-bit mode, for both 32-bit and 64-bit operand sizes, 32-bit register results are zero-extended to 64 
bits. 

The LSL instruction checks that: 

• the segment selector is not a null selector. 

• the descriptor is within the GDT or LDT limit. 

• the descriptor DPL is greater than or equal to both the CPL and RPL, or the segment is a
conforming code segment.

• the descriptor type is valid for the LSL instruction. Valid descriptor types are shown in the
following table. LDT and TSS descriptors in 64-bit mode are only valid if bits 12:8 of doubleword
+12 are zero, as described in “System Descriptors” in Volume 2.


  Valid Descriptor Type
Legacy Mode    Long Mode    Description

—              —            All code and data descriptors
1              —            Available 16-bit TSS
2              2            LDT
3              —            Busy 16-bit TSS
9              9            Available 32-bit or 64-bit TSS
B              B            Busy 32-bit or 64-bit TSS


If the segment selector passes these checks and the segment limit is loaded into the destination 
general-purpose register, the instruction sets the zero flag of the rFLAGS register to 1. If the selector 
does not pass the checks, then LSL clears the zero flag to 0 and does not modify the destination. 

The instruction calculates the segment limit to 32 bits, taking the 20-bit limit and the granularity bit 
into account. When the operand size is 16 bits, it truncates the upper 16 bits of the 32-bit adjusted 
segment limit and loads the lower 16-bits into the target register. 
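
A sketch of the limit calculation follows. The name `lsl_value` is hypothetical, and the scaling rule for G=1 (shift the 20-bit limit left by 12 and fill the low 12 bits with ones) is the conventional segment-limit expansion, stated here as an assumption since this page only says the granularity bit is taken into account.

```python
def lsl_value(limit20: int, granularity: int, operand_size: int) -> int:
    """Return the value LSL would load for a descriptor's raw limit field.

    limit20 is the 20-bit limit from the descriptor; granularity is the
    G bit. The limit is first expanded to 32 bits, then truncated to the
    low 16 bits when the operand size is 16.
    """
    limit20 &= 0xFFFFF
    limit32 = ((limit20 << 12) | 0xFFF) if granularity else limit20
    return limit32 & 0xFFFF if operand_size == 16 else limit32
```

Because of the truncation, a 16-bit LSL on a large segment returns only the low word of the byte limit, so callers that need the full limit should use a 32-bit or 64-bit destination.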


Mnemonic              Opcode    Description

LSL reg16, reg/mem16  0F 03 /r  Loads a 16-bit general-purpose register with the segment
                                limit for a selector specified in a 16-bit memory or register
                                operand.
LSL reg32, reg/mem16  0F 03 /r  Loads a 32-bit general-purpose register with the segment
                                limit for a selector specified in a 16-bit memory or register
                                operand.
LSL reg64, reg/mem16  0F 03 /r  Loads a 64-bit general-purpose register with the segment
                                limit for a selector specified in a 16-bit memory or register
                                operand.

Related Instructions

ARPL, LAR, VERR, VERW

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                                                                   M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                  Real  Virtual 8086  Protected  Cause of Exception

Invalid opcode, #UD         X         X                   This instruction is only recognized in protected mode.
Stack, #SS                                         X      A memory address exceeded the stack segment limit or was
                                                          non-canonical.
General protection, #GP                            X      A memory address exceeded a data segment limit or was
                                                          non-canonical.
                                                   X      A null data segment was used to reference memory.
Page fault, #PF                                    X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                               X      An unaligned memory reference was performed while
                                                          alignment checking was enabled.

LTR Load Task Register

Loads the specified segment selector into the visible portion of the task register (TR). The processor 
uses the selector to locate the descriptor for the TSS in the global descriptor table. It then loads this 
descriptor into the hidden portion of TR. The TSS descriptor in the GDT is marked busy, but no task 
switch is made. 

If the source operand is null, a general protection exception (#GP) is generated. 

In legacy and compatibility modes, the TSS descriptor is 8 bytes long and contains a 32-bit base 
address. 

In 64-bit mode, the instruction references a 64-bit descriptor to load a 64-bit base address. The TSS 
type (09H) is redefined in 64-bit mode for use as the 16-byte TSS descriptor. 

This instruction must be executed in protected mode when the current privilege level is 0. It is only 
provided for use by operating system software. 

The operand size attribute has no effect on this instruction. 

LTR is a serializing instruction. 
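
The "marked busy" step can be illustrated on the raw descriptor bytes. In the sketch below, `mark_tss_busy` is a hypothetical helper; it assumes the type field occupies the low four bits of descriptor byte 5 and relies on the type codes from the LAR table (1 and 9 are available TSS types, 3 and B are the corresponding busy types, which differ only in bit 1).

```python
def mark_tss_busy(descriptor: bytearray) -> None:
    """Flip an available TSS descriptor to busy, in place.

    Raises ValueError if the descriptor is not an available TSS, which
    corresponds to the #GP(selector) case in the exceptions table.
    """
    type_field = descriptor[5] & 0x0F
    if type_field not in (0x1, 0x9):
        raise ValueError("descriptor is not an available TSS")
    descriptor[5] |= 0x02   # available (1 or 9) -> busy (3 or B)
```

Running the helper twice on the same descriptor raises on the second call, mirroring the rule that LTR rejects a TSS that is already busy.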


Mnemonic       Opcode    Description

LTR reg/mem16  0F 00 /3  Load the 16-bit segment selector into the task register and
                         load the TSS descriptor from the GDT.

Related Instructions

LGDT, LIDT, LLDT, STR, SGDT, SIDT, SLDT

rFLAGS Affected

None


Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X                        This instruction is only recognized in protected mode.
Segment not present, #NP (selector)                      X          The TSS descriptor was marked not present.
Stack, #SS                                               X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                                  X          A memory address exceeded a data segment limit or was non-canonical.
                                                         X          CPL was not 0.
                                                         X          A null data segment was used to reference memory.
                                                         X          The new TSS selector was a null selector.
General protection, #GP (selector)                       X          The source selector did not point into the GDT.
                                                         X          The descriptor was beyond the GDT limit.
                                                         X          The descriptor was not an available TSS descriptor.
                                                         X          The descriptor's extended attribute bits were not zero in 64-bit mode.
                                                         X          The new TSS base address was non-canonical.
Page fault, #PF                                          X          A page fault resulted from the execution of the instruction.

MONITOR Setup Monitor Address

Establishes a linear address range of memory for hardware to monitor and puts the processor in the 
monitor event pending state. When in the monitor event pending state, the monitoring hardware 
detects stores to the specified linear address range and causes the processor to exit the monitor event 
pending state. The MWAIT instruction uses the state of the monitor hardware. 

The address range should be a write-back memory type. Executing MONITOR on an address range for 
a non-write-back memory type is not guaranteed to cause the processor to enter the monitor event 
pending state. The size of the linear address range that is established by the MONITOR instruction can 
be determined by CPUID function 0000_0005h.

The [rAX] register provides the effective address. The DS segment is the default segment used to 
create the linear address. Segment overrides may be used with the MONITOR instruction. 

The ECX register specifies optional extensions for the MONITOR instruction. There are currently no
extensions defined, and setting any bit in ECX results in a #GP exception. The ECX register
operand is implicitly 32 bits.

The EDX register specifies optional hints for the MONITOR instruction. There are currently no hints
defined, and EDX is ignored by the processor. The EDX register operand is implicitly 32 bits.

The MONITOR instruction can be executed at CPL 0 and is allowed at CPL > 0 
only if MSR C001_0015h[MonMwaitUserEn] = 1. When MSR C001_0015h[MonMwaitUserEn] = 0, 
MONITOR generates #UD at CPL > 0. (See the BIOS and Kernel Developer's Guide applicable to
your product for specific details on MSR C001_0015h.) 

MONITOR performs the same segmentation and paging checks as a 1-byte read. 

Support for the MONITOR instruction is indicated by CPUID Fn0000_0001_ECX[MONITOR] = 1. 
Software must check the CPUID bit once per program or library initialization before using the 
MONITOR instruction, or inconsistent behavior may result. Software designed to run at CPL greater 
than 0 must also check for availability by testing whether executing MONITOR causes a #UD 
exception. 

The following pseudo-code shows typical usage of a MONITOR/MWAIT pair:

EAX = Linear_Address_to_Monitor;
ECX = 0; // Extensions
EDX = 0; // Hints

WHILE (!matching_store_done) {
    MONITOR EAX, ECX, EDX
    IF (!matching_store_done) {
        MWAIT EAX, ECX
    }
}

Mnemonic  Opcode     Description
MONITOR   0F 01 C8   Establishes a linear address range to be monitored by hardware and activates the monitor hardware.

Related Instructions

MWAIT, MONITORX, MWAITX

rFLAGS Affected

None

Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The MONITOR/MWAIT instructions are not supported, as indicated by CPUID Fn0000_0001_ECX[MONITOR] = 0.
                                           X             X          CPL was not zero and MSR C001_0015[MonMwaitUserEn] = 0.
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP              X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                     X     X             X          ECX was non-zero.
                                                         X          A null data segment was used to reference memory.
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction.

MOV CRn Move to/from Control Registers

Moves the contents of a 32-bit or 64-bit general-purpose register to a control register or vice versa. 

In 64-bit mode, the operand size is fixed at 64 bits without the need for a REX prefix. In non-64-bit 
mode, the operand size is fixed at 32 bits and the upper 32 bits of the destination are forced to 0. 

CR0 maintains the state of various control bits. CR2 and CR3 are used for page translation. CR4 holds
various feature enable bits. CR8 is used to prioritize external interrupts. CR1, CR5, CR6, CR7, and 
CR9 through CR15 are all reserved and raise an undefined opcode exception (#UD) if referenced. 

CR8 can be read and written in 64-bit mode, using a REX prefix. CR8 can be read and written in all 
modes using a LOCK prefix instead of a REX prefix to specify the additional opcode bit. To verify 
whether the LOCK prefix can be used in this way, check for support of this feature. CPUID
Fn8000_0001_ECX[AltMovCr8] = 1 indicates that this feature is supported.

For more information on using the CPUID instruction, see the description of the CPUID instruction on
page 160.

CR8 can also be read and modified using the task priority register described in “System-Control 
Registers” in Volume 2. 

This instruction is always treated as a register-to-register (MOD = 11) instruction, regardless of the 
encoding of the MOD field in the MODR/M byte. 
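
As an illustration of the encoding rules above, the following sketch assembles the bytes for a MOV to or from a control register. It assumes the conventional ModRM layout (mod in bits 7:6, reg in bits 5:3, r/m in bits 2:0) and that the REX prefix used for CR8 is REX.R (44h); the function name encode_mov_crn is hypothetical. It does not reject the reserved registers (CR1, CR5-CR7), which would still raise #UD when executed.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode MOV CRn,reg (to_cr != 0) or MOV reg,CRn (to_cr == 0).
 * crn: 0..8, gpr: 0..7 (rAX..rDI). For CR8, use_lock selects the
 * LOCK-prefix form; otherwise a REX.R prefix (64-bit mode only) is used.
 * Returns the number of bytes written (at most 4), or 0 on bad input. */
size_t encode_mov_crn(uint8_t *out, unsigned crn, unsigned gpr,
                      int to_cr, int use_lock)
{
    size_t n = 0;
    if (crn > 8 || gpr > 7)
        return 0;
    if (crn == 8)
        out[n++] = use_lock ? 0xF0 : 0x44;  /* LOCK or REX.R supplies the extra bit */
    out[n++] = 0x0F;
    out[n++] = to_cr ? 0x22 : 0x20;
    /* ModRM: mod forced to 11b, reg = CRn[2:0], r/m = general-purpose register. */
    out[n++] = (uint8_t)(0xC0 | ((crn & 7) << 3) | gpr);
    return n;
}
```

For example, a write of rAX to CR3 comes out as 0F 22 D8, and a LOCK-prefixed read of CR8 into EAX as F0 0F 20 C0.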

MOV CRn is a privileged instruction and must always be executed at CPL = 0. 

MOV CRn is a serializing instruction. 


Mnemonic         Opcode        Description
MOV CRn, reg32   0F 22 /r      Move the contents of a 32-bit register to CRn.
MOV CRn, reg64   0F 22 /r      Move the contents of a 64-bit register to CRn.
MOV reg32, CRn   0F 20 /r      Move the contents of CRn to a 32-bit register.
MOV reg64, CRn   0F 20 /r      Move the contents of CRn to a 64-bit register.
MOV CR8, reg32   F0 0F 22 /r   Move the contents of a 32-bit register to CR8.
MOV CR8, reg64   F0 0F 22 /r   Move the contents of a 64-bit register to CR8.
MOV reg32, CR8   F0 0F 20 /r   Move the contents of CR8 into a 32-bit register.
MOV reg64, CR8   F0 0F 20 /r   Move the contents of CR8 into a 64-bit register.


Related Instructions 

CLTS, LMSW, SMSW 
rFLAGS Affected

None 

Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid Instruction, #UD             X     X             X          An illegal control register was referenced (CR1, CR5-CR7, CR9-CR15).
                                     X     X             X          The use of the LOCK prefix to read CR8 is not supported, as indicated by CPUID Fn8000_0001_ECX[AltMovCr8] = 0.
General protection, #GP                    X             X          CPL was not 0.
                                     X                   X          An attempt was made to set CR0.PG = 1 and CR0.PE = 0.
                                     X                   X          An attempt was made to set CR0.CD = 0 and CR0.NW = 1.
                                     X                   X          Reserved bits were set in the page-directory pointers table (used in the legacy extended physical addressing mode) and the instruction modified CR0, CR3, or CR4.
                                     X                   X          An attempt was made to write 1 to any reserved bit in CR0, CR3, CR4 or CR8.
                                     X                   X          An attempt was made to set CR0.PG while long mode was enabled (EFER.LME = 1), but paging address extensions were disabled (CR4.PAE = 0).
                                                         X          An attempt was made to clear CR4.PAE while long mode was active (EFER.LMA = 1).

MOV DRn Move to/from Debug Registers

Moves the contents of a debug register into a 32-bit or 64-bit general-purpose register or vice versa. 

In 64-bit mode, the operand size is fixed at 64 bits without the need for a REX prefix. In non-64-bit 
mode, the operand size is fixed at 32 bits and the upper 32 bits of the destination are forced to 0.

DR0 through DR3 are linear breakpoint address registers. DR6 is the debug status register and DR7 is
the debug control register. DR4 and DR5 are aliased to DR6 and DR7 if CR4.DE = 0, and are reserved 
if CR4.DE = 1. 

DR8 through DR15 are reserved and generate an undefined opcode exception if referenced.
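
The aliasing and reservation rules above can be summarized in a few lines of C. This is only a model of the register-selection rule, not of the instruction itself; the function name effective_dr is hypothetical, and -1 stands in for the #UD cases.

```c
/* Map a referenced debug-register number to the register actually
 * accessed. Returns the effective DR number, or -1 where the text
 * calls for an undefined opcode exception (#UD). */
int effective_dr(unsigned n, int cr4_de)
{
    if (n > 7)
        return -1;                        /* DR8-DR15 are reserved */
    if (n == 4 || n == 5)
        return cr4_de ? -1 : (int)n + 2;  /* DR4 -> DR6, DR5 -> DR7 when CR4.DE = 0 */
    return (int)n;
}
```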

These instructions are privileged and must be executed at CPL 0. 

The MOV DRn, reg32 and MOV DRn, reg64 instructions are serializing instructions. 

The MOV(DR) instruction is always treated as a register-to-register (MOD = 11) instruction,
regardless of the encoding of the MOD field in the MODR/M byte.

See “Debug and Performance Resources” in Volume 2 for details.


Mnemonic         Opcode     Description
MOV reg32, DRn   0F 21 /r   Move the contents of DRn to a 32-bit register.
MOV reg64, DRn   0F 21 /r   Move the contents of DRn to a 64-bit register.
MOV DRn, reg32   0F 23 /r   Move the contents of a 32-bit register to DRn.
MOV DRn, reg64   0F 23 /r   Move the contents of a 64-bit register to DRn.



Related Instructions 

None 


rFLAGS Affected 

None 
Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Debug, #DB                           X                   X          A debug register was referenced while the general detect (GD) bit in DR7 was set.
Invalid opcode, #UD                  X                   X          DR4 or DR5 was referenced while the debug extensions (DE) bit in CR4 was set.
                                                         X          An illegal debug register (DR8-DR15) was referenced.
General protection, #GP                    X             X          CPL was not 0.
                                                         X          A 1 was written to any of the upper 32 bits of DR6 or DR7 in 64-bit mode.

MWAIT Monitor Wait

Used in conjunction with the MONITOR instruction to cause a processor to wait until a store occurs to 
a specific linear address range from another processor. The previously executed MONITOR 
instruction causes the processor to enter the monitor event pending state. The MWAIT instruction may 
enter an implementation-dependent power state until the monitor event pending state is exited. The
MWAIT instruction has the same effect on architectural state as the NOP instruction. 

Events that cause an exit from the monitor event pending state include: 

• A store from another processor matches the address range established by the MONITOR 
instruction. 

• Any unmasked interrupt, including INTR, NMI, SMI, INIT. 

• RESET. 

• Any far control transfer that occurs between the MONITOR and the MWAIT. 

EAX specifies optional hints for the MWAIT instruction. An optimized C-state request is communicated
through EAX[7:4]. The processor C-state is EAX[7:4]+1, so to request C0 is to place the value Fh in
EAX[7:4] and to request C1 is to place the value 0 in EAX[7:4]. All other components of EAX should
be zero when making the C1 request. Reserved bits that are set in EAX are ignored by the processor.
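
A minimal sketch of the hint encoding just described, assuming the requested C-state is in the range 0 to 15 and that the "+1" wraps within the 4-bit field (which is what the C0 = Fh case implies). The function name mwait_hint is hypothetical.

```c
#include <stdint.h>

/* Build the MWAIT EAX hint for a requested C-state (0..15 assumed).
 * EAX[7:4] holds (cstate - 1) truncated to 4 bits; all other bits
 * are left zero. */
uint32_t mwait_hint(unsigned cstate)
{
    return ((cstate - 1u) & 0xFu) << 4;
}
```

With this encoding, a C0 request yields F0h in EAX and a C1 request yields 0.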

ECX specifies optional extensions for the MWAIT instruction. The only extension currently defined is
ECX bit 0, which allows interrupts to wake MWAIT, even when eFLAGS.IF = 0. Support for this
extension is indicated by a feature flag returned by the CPUID instruction. Setting any unsupported
bit in ECX results in a #GP exception.

CPUID Function 0000_0005h indicates support for extended features of MONITOR/MWAIT: 

• CPUID Fn0000_0005_ECX[EMX] = 1 indicates support for enumeration of MONITOR/MWAIT 
extensions. 

• CPUID Fn0000_0005_ECX[IBE] = 1 indicates that MWAIT can set ECX[0] to allow interrupts to 
cause an exit from the monitor event pending state even when eFLAGS.IF = 0. 

The MWAIT instruction can be executed at CPL 0 and is allowed at CPL > 0 only if MSR
C001_0015h[MonMwaitUserEn] = 1. When MSR C001_0015h[MonMwaitUserEn] is 0, MWAIT
generates #UD at CPL > 0. (See the BIOS and Kernel Developer's Guide applicable to your product
for specific details on MSR C001_0015h.)

Support for the MWAIT instruction is indicated by CPUID Fn0000_0001_ECX[MONITOR] = 1.
Software must check the CPUID bit once per program or library initialization before using the
MWAIT instruction, or inconsistent behavior may result. Software designed to run at CPL greater than
0 must also check for availability by testing whether executing MWAIT causes a #UD exception.

The use of the MWAIT instruction is contingent upon the satisfaction of the following coding 
requirements: 

• MONITOR must precede the MWAIT and occur in the same loop. 





24594 — Rev. 3.28—September 2019 



AMD64 Technology 


• MWAIT must be conditionally executed only if the awaited store has not already occurred. (This 
prevents a race condition between the MONITOR instruction arming the monitoring hardware and 
the store intended to trigger the monitoring hardware.) 

The following pseudo-code shows typical usage of a MONITOR/MWAIT pair:

EAX = Linear_Address_to_Monitor;
ECX = 0; // Extensions
EDX = 0; // Hints

WHILE (!matching_store_done) {
    MONITOR EAX, ECX, EDX
    IF (!matching_store_done) {
        MWAIT EAX, ECX
    }
}


Mnemonic  Opcode     Description
MWAIT     0F 01 C9   Causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events.

Related Instructions

MONITOR

rFLAGS Affected

None


Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The MONITOR/MWAIT instructions are not supported, as indicated by CPUID Fn0000_0001_ECX[MONITOR] = 0.
                                           X             X          CPL was not zero and MSR C001_0015[MonMwaitUserEn] = 0.
General protection, #GP              X     X             X          Unsupported extension bits were set in ECX.

RDMSR Read Model-Specific Register

Loads the contents of a 64-bit model-specific register (MSR) specified in the ECX register into
registers EDX:EAX. The EDX register receives the high-order 32 bits and the EAX register receives
the low-order 32 bits. The RDMSR instruction ignores operand size; ECX always holds the MSR number,
and EDX:EAX holds the data. If a model-specific register has fewer than 64 bits, the unimplemented
bit positions loaded into the destination registers are undefined.
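
Software that receives the two halves typically recombines them into one 64-bit value. A minimal sketch, with the hypothetical helper name msr_value:

```c
#include <stdint.h>

/* Combine the EDX:EAX pair returned by RDMSR into one 64-bit value.
 * EDX supplies bits 63:32 and EAX supplies bits 31:0. */
uint64_t msr_value(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}
```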

This instruction must be executed at a privilege level of 0 or a general protection exception (#GP) will 
be raised. This exception is also generated if a reserved or unimplemented model-specific register is 
specified in ECX. 

Support for the RDMSR instruction is indicated by CPUID Fn0000_0001_EDX[MSR] = 1 OR
CPUID Fn8000_0001_EDX[MSR] = 1. For more information on using the CPUID instruction, see the
description of the CPUID instruction on page 160.

For more information about model-specific registers, see the documentation for various hardware 
implementations and “Model-Specific Registers (MSRs)” in Volume 2: System Programming. 


Mnemonic  Opcode  Description
RDMSR     0F 32   Copy MSR specified by ECX into EDX:EAX.

Related Instructions

WRMSR, RDTSC, RDPMC

rFLAGS Affected

None

Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The RDMSR instruction is not supported, as indicated by CPUID Fn0000_0001_EDX[MSR] = 0 or CPUID Fn8000_0001_EDX[MSR] = 0.
General protection, #GP                    X             X          CPL was not 0.
                                     X                   X          The value in ECX specifies a reserved or unimplemented MSR address.




RDPMC Read Performance-Monitoring Counter 

Reads the contents of a 64-bit performance counter and returns it in the registers EDX:EAX. The ECX
register specifies the index of the performance counter to be read. The EDX register
receives the high-order 32 bits and the EAX register receives the low-order 32 bits of the counter. The
RDPMC instruction ignores operand size; the index and the return values are all 32 bits.

The base architecture supports four core performance counters: PerfCtr0-3. An extension to the
architecture increases the number of core performance counters to 6 (PerfCtr0-5). Other extensions
add four northbridge performance counters NB_PerfCtr0-3 and four L2 cache performance counters
L2I_PerfCtr0-3.

To select the core performance counter to be read, specify the counter index, rather than the 
performance counter MSR address. To access the northbridge performance counters, specify the index 
of the counter plus 6. To access the L2 cache performance counters, specify the index of the counter 
plus 10. 
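
The index rule above can be written out as a small helper. This is a sketch only: the enum and function names are hypothetical, and the range limits are the counter counts quoted in this section (core 0-5, northbridge 0-3, L2 0-3); whether a given counter is actually implemented still has to be checked through CPUID.

```c
enum pmc_kind { PMC_CORE, PMC_NORTHBRIDGE, PMC_L2 };

/* Returns the value to place in ECX for RDPMC, or -1 if the counter
 * number is outside the range described in the text. */
int rdpmc_index(enum pmc_kind kind, unsigned n)
{
    switch (kind) {
    case PMC_CORE:        return n <= 5 ? (int)n      : -1;
    case PMC_NORTHBRIDGE: return n <= 3 ? (int)n + 6  : -1;  /* index + 6  */
    case PMC_L2:          return n <= 3 ? (int)n + 10 : -1;  /* index + 10 */
    }
    return -1;
}
```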

Programs running at any privilege level can read performance monitor counters if the PCE flag in CR4
is set to 1; otherwise this instruction must be executed at a privilege level of 0.

This instruction is not serializing. Therefore, there is no guarantee that all instructions have completed
at the time the performance counter is read.

For more information about performance-counter registers, see the documentation for various 
hardware implementations and “Performance Counters” in Volume 2. 

Support for the core performance counters PerfCtr4-5 is indicated by CPUID
Fn8000_0001_ECX[PerfCtrExtCore] = 1. CPUID Fn8000_0001_ECX[PerfCtrExtNB] = 1 indicates
support for the four architecturally defined northbridge performance counters and CPUID
Fn8000_0001_ECX[PerfCtrExtL2I] = 1 indicates support for the L2 cache performance counters.

For more information on using the CPUID instruction, see the description of the CPUID instruction on
page 160.

Instruction Encoding

Mnemonic  Opcode  Description
RDPMC     0F 33   Copy the performance monitor counter specified by ECX into EDX:EAX.

Related Instructions

RDMSR, WRMSR

rFLAGS Affected

None

Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP              X     X             X          The value in ECX specified an unimplemented performance counter number.
                                           X             X          CPL was not 0 and CR4.PCE = 0.

RDTSC Read Time-Stamp Counter

Loads the value of the processor’s 64-bit time-stamp counter into registers EDX:EAX. 

The time-stamp counter (TSC) is contained in a 64-bit model-specific register (MSR). The processor 
sets the counter to 0 upon reset and increments the counter every clock cycle. INIT does not modify the 
TSC. 

The high-order 32 bits are loaded into EDX, and the low-order 32 bits are loaded into the EAX 
register. This instruction ignores operand size. 

When the time-stamp disable flag (TSD) in CR4 is set to 1, the RDTSC instruction can only be used at 
privilege level 0. If the TSD flag is 0, this instruction can be used at any privilege level. 

This instruction is not serializing. Therefore, there is no guarantee that all instructions have completed 
at the time the time-stamp counter is read. 

The behavior of the RDTSC instruction is implementation dependent. The TSC counts at a constant 
rate, but may be affected by power management events (such as frequency changes), depending on the 
processor implementation. If CPUID Fn8000_0007_EDX[TscInvariant] = 1, then the TSC rate is 
ensured to be invariant across all P-States, C-States, and stop-grant transitions (such as STPCLK 
Throttling); therefore, the TSC is suitable for use as a source of time. Consult the BIOS and Kernel 
Developer’s Guide applicable to your product for information concerning the effect of power 
management on the TSC. 

Support for the RDTSC instruction is indicated by CPUID Fn0000_0001_EDX[TSC] = 1 OR CPUID 
Fn8000_0001_EDX[TSC] = 1. For more information on using the CPUID instruction, see the 
description of the CPUID instruction on page 160. 

Instruction Encoding 

Mnemonic Opcode Description 

RDTSC 0F 31 Copy the time-stamp counter into EDX:EAX.

Related Instructions 

RDTSCP, RDMSR, WRMSR 

rFLAGS Affected 

None 




Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The RDTSC instruction is not supported, as indicated by CPUID Fn0000_0001_EDX[TSC] = 0 OR CPUID Fn8000_0001_EDX[TSC] = 0.
General protection, #GP                    X             X          CPL was not 0 and CR4.TSD = 1.

RDTSCP Read Time-Stamp Counter and Processor ID

Loads the value of the processor’s 64-bit time-stamp counter into registers EDX:EAX, and loads the
value of TSC_AUX into ECX. This instruction ignores operand size.

The time-stamp counter is contained in a 64-bit model-specific register (MSR). The processor sets the 
counter to 0 upon reset and increments the counter every clock cycle. INIT does not modify the TSC. 

The high-order 32 bits are loaded into EDX, and the low-order 32 bits are loaded into the EAX 
register. 

The TSC_AUX value is contained in the low-order 32 bits of the TSC_AUX register (MSR address 
C000_0103h). This MSR is initialized by privileged software to any meaningful value, such as a 
processor ID, that software wants to associate with the returned TSC value. 

When the time-stamp disable flag (TSD) in CR4 is set to 1, the RDTSCP instruction can only be used 
at privilege level 0. If the TSD flag is 0, this instruction can be used at any privilege level. 

Unlike the RDTSC instruction, RDTSCP forces all older instructions to retire before reading the time- 
stamp counter. 

The behavior of the RDTSCP instruction is implementation dependent. The TSC counts at a constant 
rate, but may be affected by power management events (such as frequency changes), depending on the 
processor implementation. If CPUID Fn8000_0007_EDX[TscInvariant] = 1, then the TSC rate is 
ensured to be invariant across all P-States, C-States, and stop-grant transitions (such as STPCLK 
Throttling); therefore, the TSC is suitable for use as a source of time. Consult the BIOS and Kernel 
Developer’s Guide applicable to your product for information concerning the effect of power 
management on the TSC. 

Support for the RDTSCP instruction is indicated by CPUID Fn8000_0001_EDX[RDTSCP] = 1. For 
more information on using the CPUID instruction, see the description of the CPUID instruction on 
page 160. 

Instruction Encoding

Mnemonic  Opcode     Description
RDTSCP    0F 01 F9   Copy the time-stamp counter into EDX:EAX and the TSC_AUX register into ECX.

Related Instructions

RDTSC

rFLAGS Affected

None

Exceptions

Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The RDTSCP instruction is not supported, as indicated by CPUID Fn8000_0001_EDX[RDTSCP] = 0.
General protection, #GP                    X             X          CPL was not 0 and CR4.TSD = 1.




RSM Resume from System Management Mode 

Resumes an operating system or application procedure previously interrupted by a system 
management interrupt (SMI). The processor state is restored from the information saved when the SMI 
was taken. The processor goes into a shutdown state if it detects invalid state information in the system 
management mode (SMM) save area during RSM. 

RSM will shut down if any of the following conditions are found in the save map (SSM): 

• An illegal combination of flags in CR0 (CR0.PG = 1 and CR0.PE = 0, or CR0.NW = 1 and
CR0.CD = 0).

• A reserved bit in CR3, CR4, or the extended feature enable register (EFER) is set to 1.

• A reserved bit in the range 63:32 of CR0, DR6, or DR7 is set to 1.

• The following bit combination occurs: EFER.LME = 1, CR0.PG = 1, CR4.PAE = 0.

• The following bit combination occurs: EFER.LME = 1, CR0.PG = 1, CR4.PAE = 1, CS.D = 1,
CS.L = 1.

• SMM revision field has been modified. 
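
Several of the conditions listed above are plain bit tests and can be sketched in C. The bit positions used (CR0.PE = 0, CR0.NW = 29, CR0.CD = 30, CR0.PG = 31, CR4.PAE = 5, EFER.LME = 8, EFER.SVME = 12) are the architectural positions defined in Volume 2 and are assumptions relative to this page. The function name rsm_state_is_invalid is hypothetical. The sketch does not cover the implementation-specific reserved bits of CR3, CR4, EFER, DR6, and DR7 or the SMM revision field, and it folds in the paged-real-mode allowance that this section states for EFER.SVME = 1.

```c
#include <stdint.h>

#define CR0_PE    (1u << 0)
#define CR0_NW    (1u << 29)
#define CR0_CD    (1u << 30)
#define CR0_PG    (1u << 31)
#define CR4_PAE   (1u << 5)
#define EFER_LME  (1u << 8)
#define EFER_SVME (1u << 12)

/* Returns 1 if the saved state matches one of the modeled shutdown
 * conditions, 0 otherwise. cs_d and cs_l are the CS.D and CS.L bits. */
int rsm_state_is_invalid(uint64_t cr0, uint64_t cr4, uint64_t efer,
                         int cs_d, int cs_l)
{
    if (cr0 >> 32)
        return 1;                          /* bits 63:32 of CR0 set */
    if ((cr0 & CR0_PG) && !(cr0 & CR0_PE) && !(efer & EFER_SVME))
        return 1;                          /* paging without protection */
    if ((cr0 & CR0_NW) && !(cr0 & CR0_CD))
        return 1;                          /* NW = 1 with CD = 0 */
    if ((efer & EFER_LME) && (cr0 & CR0_PG)) {
        if (!(cr4 & CR4_PAE))
            return 1;                      /* long mode paging without PAE */
        if (cs_d && cs_l)
            return 1;                      /* CS.D = 1 and CS.L = 1 */
    }
    return 0;
}
```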

RSM cannot modify EFER.SVME. Attempts to do so are ignored. 

When EFER.SVME is 1, RSM reloads the four PDPEs (through the incoming CR3) when returning to 
a mode that has legacy PAE mode paging enabled. 

When EFER.SVME is 1, the RSM instruction is permitted to return to paged real mode (i.e., 
CR0.PE=0 and CR0.PG=1). 

The AMD64 architecture uses a new 64-bit SMM state-save memory image. This 64-bit save-state
map is used in all operating modes. See “System-Management Mode” in Volume 2 for
details.


Mnemonic  Opcode  Description
RSM       0F AA   Resume operation of an interrupted program.

Related Instructions

None

rFLAGS Affected

All flags are restored from the state-save map (SSM).

ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
M   M    M    M   M   M   M   M      M   M   M   M   M   M   M   M   M
21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The processor was not in System Management Mode (SMM).

SGDT Store Global Descriptor Table Register

Stores the global descriptor table register (GDTR) into the destination operand. In legacy and 
compatibility mode, the destination operand is 6 bytes; in 64-bit mode, it is 10 bytes. In all modes, 
operand-size prefixes are ignored. 

In non-64-bit mode, the lower two bytes of the operand specify the 16-bit limit and the upper 4 bytes 
specify the 32-bit base address. 

In 64-bit mode, the lower two bytes of the operand specify the 16-bit limit and the upper 8 bytes 
specify the 64-bit base address. 
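
The two operand layouts can be decoded with a short routine such as the following. It works on a byte buffer that is assumed to already hold the bytes stored by the instruction, so it involves no privileged or inline-assembly code; the struct and function names are hypothetical.

```c
#include <stdint.h>

struct gdtr_image {
    uint16_t limit;  /* bytes 0..1 of the stored operand */
    uint64_t base;   /* bytes 2..5 (non-64-bit mode) or 2..9 (64-bit mode) */
};

/* Decode the bytes written by SGDT: 6 bytes outside 64-bit mode,
 * 10 bytes in 64-bit mode. Values are stored little-endian. */
struct gdtr_image decode_sgdt(const uint8_t *buf, int mode64)
{
    struct gdtr_image g;
    int base_bytes = mode64 ? 8 : 4;
    g.limit = (uint16_t)(buf[0] | (buf[1] << 8));
    g.base = 0;
    for (int i = 0; i < base_bytes; i++)
        g.base |= (uint64_t)buf[2 + i] << (8 * i);
    return g;
}
```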

This instruction is intended for use in operating system software, but it can be used at any privilege 
level. 

Mnemonic Opcode Description 

SGDT mem16:32 0F 01 /0 Store global descriptor table register to memory.

SGDT mem16:64 0F 01 /0 Store global descriptor table register to memory.

Related Instructions 

SIDT, SLDT, STR, LGDT, LIDT, LLDT, LTR 

rFLAGS Affected 

None 


Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The operand was a register.
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP              X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                         X          The destination operand was in a non-writable segment.
                                                         X          A null data segment was used to reference memory.
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                       X             X          An unaligned memory reference was performed while alignment checking was enabled.

SIDT Store Interrupt Descriptor Table Register

Stores the interrupt descriptor table register (IDTR) in the destination operand. In legacy and 
compatibility mode, the destination operand is 6 bytes; in 64-bit mode it is 10 bytes. In all modes, 
operand-size prefixes are ignored. 

In non-64-bit mode, the lower two bytes of the operand specify the 16-bit limit and the upper 4 bytes 
specify the 32-bit base address. 

In 64-bit mode, the lower two bytes of the operand specify the 16-bit limit and the upper 8 bytes 
specify the 64-bit base address. 

This instruction is intended for use in operating system software, but it can be used at any privilege 
level. 

Mnemonic Opcode Description 

SIDT mem16:32 0F 01 /1 Store interrupt descriptor table register to memory.

SIDT mem16:64 0F 01 /1 Store interrupt descriptor table register to memory.

Related Instructions 

SGDT, SLDT, STR, LGDT, LIDT, LLDT, LTR 

rFLAGS Affected 

None 


Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                  X     X             X          The operand was a register.
Stack, #SS                           X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP              X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                         X          The destination operand was in a non-writable segment.
                                                         X          A null data segment was used to reference memory.
Page fault, #PF                            X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                       X             X          An unaligned memory reference was performed while alignment checking was enabled.

SKINIT Secure Init and Jump with Attestation

Securely reinitializes the CPU, allowing for the startup of trusted software (such as a VMM). The code
to be executed after reinitialization can be verified based on a secure hash comparison. SKINIT takes
the physical base address of the Secure Loader Block (SLB) as its only input operand, in EAX. The SLB
must be structured as described in “Secure Loader Block” on page 499 of the AMD64 Architecture
Programmer's Manual Volume 2: System Programming, order# 24593, and is assumed to contain the
code for a Secure Loader (SL).

This is a Secure Virtual Machine (SVM) instruction. Support for the SVM architecture and the SVM 
instructions is indicated by CPUID Fn8000_0001_ECX[SVM] = 1. For more information on using the 
CPUID instruction, see the reference page for the CPUID instruction on page 160. 

This instruction generates a #UD exception if SVM is not enabled. See “Enabling SVM” in AMD64
Architecture Programmer's Manual Volume 2: System Programming, order# 24593.


Mnemonic     Opcode     Description
SKINIT EAX   0F 01 DE   Secure initialization and jump, with attestation.

Action

IF ((EFER.SVMEN == 0) && !(CPUID 8000_0001.ECX[SKINIT]) || (!PROTECTED_MODE))
    EXCEPTION [#UD]           // This instruction can only be executed
                              // in protected mode with SVM enabled.

IF (CPL != 0)                 // This instruction is only allowed at CPL 0.
    EXCEPTION [#GP]

Initialize processor state as for an INIT signal
CR0.PE = 1

CS.sel = 0x0008
CS.attr = 32-bit code, read/execute
CS.base = 0
CS.limit = 0xFFFFFFFF

SS.sel = 0x0010
SS.attr = 32-bit stack, read/write, expand up
SS.base = 0
SS.limit = 0xFFFFFFFF

EAX = EAX & 0xFFFF0000        // Form SLB base address.
EDX = family/model/stepping
ESP = EAX + 0x00010000        // Initial SL stack.
Clear GPRs other than EAX, EDX, ESP

EFER = 0
VM_CR.DPD = 1
VM_CR.R_INIT = 1
VM_CR.DIS_A20M = 1

Enable SL_DEV, to protect 64 Kbytes of physical memory starting at
the physical address in EAX

GIF = 0

Read the SL length from offset 0x0002 in the SLB
Copy the SL image to the TPM for attestation

Read the SL entrypoint offset from offset 0x0000 in the SLB
Jump to the SL entrypoint, at EIP = EAX + entrypoint offset
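
The address arithmetic in the action above can be checked with a small model. This is a sketch of the calculations only, not of the instruction: it assumes the SL entry-point offset and SL length are 16-bit little-endian fields (the action gives only their offsets), and the struct and function names are hypothetical.

```c
#include <stdint.h>

struct skinit_entry {
    uint32_t slb_base;   /* EAX after masking to a 64-Kbyte boundary */
    uint32_t esp;        /* initial SL stack pointer */
    uint32_t eip;        /* SL entry point */
    uint16_t sl_length;  /* SL length read from the SLB header */
};

/* Compute the register values formed by the action, given the input
 * EAX and a pointer to the first four bytes of the SLB. */
struct skinit_entry skinit_entry_state(uint32_t eax, const uint8_t *slb)
{
    struct skinit_entry s;
    uint16_t entry = (uint16_t)(slb[0] | (slb[1] << 8));  /* offset 0x0000 */
    s.sl_length    = (uint16_t)(slb[2] | (slb[3] << 8));  /* offset 0x0002 */
    s.slb_base = eax & 0xFFFF0000u;
    s.esp = s.slb_base + 0x00010000u;
    s.eip = s.slb_base + entry;
    return s;
}
```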

Related Instructions 

None. 

rFLAGS Affected

ID  VIP  VIF  AC  VM  RF  NT  IOPL   OF  DF  IF  TF  SF  ZF  AF  PF  CF
0   0    0    0   0   0   0   0      0   0   0   0   0   0   0   0   0
21  20   19   18  17  16  14  13:12  11  10  9   8   7   6   4   2   0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                            Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                                      X          Secure Virtual Machine was not enabled (EFER.SVME=0) and both of the following conditions were true: SVM-Lock is not available, as indicated by CPUID Fn8000_000A_EDX[SVML] = 0; DEV is not available, as indicated by CPUID Fn8000_0001_ECX[SKINIT] = 0.
                                     X     X                        Instruction is only recognized in protected mode.
General protection, #GP                                  X          CPL was not zero.

SLDT Store Local Descriptor Table Register

Stores the local descriptor table (LDT) selector to a register or memory destination operand. 

If the destination is a register, the selector is zero-extended into a 16-, 32-, or 64-bit general purpose 
register, depending on operand size. 

If the destination operand is a memory location, the segment selector is written to memory as a 16-bit 
value, regardless of operand size. 

This SLDT instruction can only be used in protected mode, but it can be executed at any privilege 
level. 


Mnemonic     Opcode     Description
SLDT reg16   0F 00 /0   Store the segment selector from the local descriptor table register to a 16-bit register.
SLDT reg32   0F 00 /0   Store the segment selector from the local descriptor table register to a 32-bit register.
SLDT reg64   0F 00 /0   Store the segment selector from the local descriptor table register to a 64-bit register.
SLDT mem16   0F 00 /0   Store the segment selector from the local descriptor table register to a 16-bit memory location.

Related Instructions 

SIDT, SGDT, STR, LIDT, LGDT, LLDT, LTR 

rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        This instruction is only recognized in protected mode.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled.


AMD64 Technology
24594 - Rev. 3.28 - September 2019

SMSW Store Machine Status Word 

Stores the lower bits of the machine status word (CR0). The target can be a 16-, 32-, or 64-bit register
or a 16-bit memory operand.

This instruction is provided for compatibility with early processors. 

This instruction can be used at any privilege level (CPL). 
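
As an illustration only, the operand-size behavior listed in the table that follows can be modeled in a few lines of Python. The helper names smsw_register and smsw_memory are hypothetical, the little-endian byte order is the usual x86 convention rather than something stated in this entry, and the sample CR0 value in the usage note is made up for the example.

```python
# Illustrative model of SMSW; helper names are assumptions, not an AMD interface.
_MASKS = {16: 0xFFFF, 32: 0xFFFF_FFFF, 64: 0xFFFF_FFFF_FFFF_FFFF}


def smsw_register(cr0: int, operand_size: int) -> int:
    """Return the value SMSW would store to a register of the given size."""
    if operand_size not in _MASKS:
        raise ValueError("operand size must be 16, 32, or 64")
    # Only the low operand_size bits of CR0 are stored.
    return cr0 & _MASKS[operand_size]


def smsw_memory(cr0: int) -> bytes:
    """Return the two bytes SMSW would write to a mem16 destination."""
    # A memory destination always receives the low 16 bits of CR0.
    return (cr0 & 0xFFFF).to_bytes(2, "little")
```

For a sample CR0 of 0x80050033, the 16-bit register form yields 0x0033 and the 32-bit form yields the full 0x80050033; the memory form writes only two bytes.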


Mnemonic      Opcode      Description
SMSW reg16    0F 01 /4    Store the low 16 bits of CR0 to a 16-bit register.
SMSW reg32    0F 01 /4    Store the low 32 bits of CR0 to a 32-bit register.
SMSW reg64    0F 01 /4    Store the entire 64-bit CR0 to a 64-bit register.
SMSW mem16    0F 01 /4    Store the low 16 bits of CR0 to memory.


Related Instructions 

LMSW, MOV CRn 


rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Stack, #SS               X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP  X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC           X             X          An unaligned memory reference was performed while alignment checking was enabled.
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STAC Set Alignment Check Flag 

Sets the Alignment Check (AC) flag in the rFLAGS register to 1. Support for the STAC instruction is
indicated by CPUID Fn07_EBX[20] = 1. For more information on using the CPUID instruction, see
the description of the CPUID instruction on page 160.


Mnemonic    Opcode      Description

STAC        0F 01 CB    Sets the AC flag

Related Instructions

CLAC

rFLAGS Affected


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
               1
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are
blank. Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          Instruction not supported by CPUID.
                               X                        Instruction is not supported in virtual mode.
                                             X          Lock prefix (F0h) preceding opcode.
                                             X          CPL was not 0.
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STI Set Interrupt Flag 

Sets the interrupt flag (IF) in the rFLAGS register to 1, thereby allowing external interrupts received 
on the INTR input. Interrupts received on the non-maskable interrupt (NMI) input are not affected by 
this instruction. 

In real mode, this instruction sets IF to 1. 

In protected mode and virtual-8086-mode, this instruction is IOPL-sensitive. If the CPL is less than or 
equal to the rFLAGS.IOPL field, the instruction sets IF to 1. 

In protected mode, if IOPL < 3, CPL = 3, and protected mode virtual interrupts are enabled
(CR4.PVI = 1), then the instruction instead sets rFLAGS.VIF to 1. If none of these conditions apply,
the processor raises a general protection exception (#GP). For more information, see “Protected Mode
Virtual Interrupts” in Volume 2.

In virtual-8086 mode, if IOPL < 3 and the virtual-8086-mode extensions are enabled (CR4.VME = 1), 
the STI instruction instead sets the virtual interrupt flag (rFLAGS.VIF) to 1. 

If STI sets the IF flag and IF was initially clear, then interrupts are not enabled until after the
instruction following STI. Thus, if IF is 0, this code will not allow an INTR to happen:

STI 

CLI 

In the following sequence, INTR will be allowed to happen only after the NOP. 

STI 

NOP 

CLI 

If STI sets the VIF flag and VIP is already set, a #GP fault will be generated. 

See “Virtual-8086 Mode Extensions” in Volume 2 for more information about IOPL-sensitive 
instructions. 
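
The mode-dependent rules above can be sketched as a small Python model. This is a minimal sketch under stated assumptions, not a definitive description of the hardware: the CpuState class, its field names, the mode strings, and the GeneralProtection exception are all hypothetical, and the one-instruction interrupt shadow after STI is not modeled.

```python
from dataclasses import dataclass


class GeneralProtection(Exception):
    """Stands in for a #GP(0) fault in this model."""


@dataclass
class CpuState:
    cpl: int
    iopl: int
    mode: str = "protected"   # "real", "virtual8086", or "protected" (assumed labels)
    cr4_vme: bool = False     # CR4.VME
    cr4_pvi: bool = False     # CR4.PVI
    IF: bool = False          # rFLAGS.IF
    VIF: bool = False         # rFLAGS.VIF
    VIP: bool = False         # rFLAGS.VIP


def sti(s: CpuState) -> None:
    """Apply the STI decision rules to the given state."""
    if s.cpl <= s.iopl:
        # Sufficient privilege: the real interrupt flag is set.
        s.IF = True
    elif (s.mode == "virtual8086" and s.cr4_vme) or (
        s.mode == "protected" and s.cr4_pvi and s.cpl == 3
    ):
        # Virtual interrupts are in use: set VIF unless an interrupt is pending.
        if s.VIP:
            raise GeneralProtection("VIF would be set while VIP is already 1")
        s.VIF = True
    else:
        raise GeneralProtection("CPL > IOPL and virtual interrupts not enabled")
```

In real mode the CPL is 0, so the first branch always applies and IF is set.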


Mnemonic    Opcode    Description

STI         FB        Set interrupt flag (IF) to 1.
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Action 

IF (CPL <= IOPL)
    RFLAGS.IF = 1
ELSIF (((VIRTUAL_MODE) && (CR4.VME == 1))
       || ((PROTECTED_MODE) && (CR4.PVI == 1) && (CPL == 3)))
{
    IF (RFLAGS.VIP == 1)
        EXCEPTION[#GP(0)]
    RFLAGS.VIF = 1
}
ELSE
    EXCEPTION[#GP(0)]

Related Instructions 

CLI 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
          M                                         M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. M (modified) is either set to one or cleared to zero. Unaffected flags are
blank. Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
General protection, #GP        X                        The CPL was greater than the IOPL and virtual-mode extensions were not enabled (CR4.VME = 0).
                                             X          The CPL was greater than the IOPL and either the CPL was not 3 or protected-mode virtual interrupts were not enabled (CR4.PVI = 0).
                               X             X          This instruction would set RFLAGS.VIF to 1 and RFLAGS.VIP was already 1.
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STGI Set Global Interrupt Flag 

Sets the global interrupt flag (GIF) to 1. While GIF is zero, all external interrupts are disabled. 

This is a Secure Virtual Machine (SVM) instruction. 

Attempted execution of this instruction causes a #UD exception if SVM is not enabled and neither 
SVM Lock nor the device exclusion vector (DEV) are supported. Support for SVM Lock is indicated 
by CPUID Fn8000_000A_EDX[SVML] = 1. Support for DEV is part of the SKINIT architecture and 
is indicated by CPUID Fn8000_0001_ECX[SKINIT] = 1. For more information on using the CPUID 
instruction, see the description of the CPUID instruction on page 160. 
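
The #UD condition in the paragraph above reduces to a single boolean expression. The following sketch assumes the three inputs have already been read from EFER.SVME and the two CPUID feature bits; the function name stgi_raises_ud is hypothetical, and the separate protected-mode and CPL checks are not modeled.

```python
def stgi_raises_ud(svme: bool, svm_lock: bool, skinit: bool) -> bool:
    """Return True if STGI would raise #UD under the rule described above.

    svme     -- EFER.SVME (SVM enabled)
    svm_lock -- CPUID Fn8000_000A_EDX[SVML]
    skinit   -- CPUID Fn8000_0001_ECX[SKINIT] (DEV support)
    """
    # #UD only when SVM is disabled AND neither SVM Lock nor DEV is supported.
    return (not svme) and (not svm_lock) and (not skinit)
```

With SVM disabled, either feature bit alone is enough to avoid the #UD in this model.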

For information on enabling SVM, see “Enabling SVM” in AMD64 Architecture Programmer's
Manual Volume 2: System Programming, order# 24593.


Mnemonic    Opcode      Description

STGI        0F 01 DC    Sets the global interrupt flag (GIF).

Related Instructions 

CLGI 

rFLAGS Affected 

None. 


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD                          X          Secure Virtual Machine was not enabled (EFER.SVME=0) and both of the following conditions were true:
                                                        • SVM Lock is not available, as indicated by CPUID Fn8000_000A_EDX[SVML] = 0.
                                                        • DEV is not available, as indicated by CPUID Fn8000_0001_ECX[SKINIT] = 0.
                         X     X                        Instruction is only recognized in protected mode.
General protection, #GP                      X          CPL was not zero.
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STR Store Task Register 

Stores the task register (TR) selector to a register or memory destination operand. 

If the destination is a register, the selector is zero-extended into a 16-, 32-, or 64-bit general purpose 
register, depending on the operand size. 

If the destination is a memory location, the segment selector is written to memory as a 16-bit value, 
regardless of operand size. 

The STR instruction can only be used in protected mode, but it can be used at any privilege level. 

Mnemonic     Opcode      Description
STR reg16    0F 00 /1    Store the segment selector from the task register to a 16-bit general-purpose register.
STR reg32    0F 00 /1    Store the segment selector from the task register to a 32-bit general-purpose register.
STR reg64    0F 00 /1    Store the segment selector from the task register to a 64-bit general-purpose register.
STR mem16    0F 00 /1    Store the segment selector from the task register to a 16-bit memory location.

Related Instructions

LGDT, LIDT, LLDT, LTR, SIDT, SGDT, SLDT

rFLAGS Affected

None

Exceptions

Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X                        This instruction is only recognized in protected mode.
Stack, #SS                                   X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X          A memory address exceeded a data segment limit or was non-canonical.
                                             X          The destination operand was in a non-writable segment.
                                             X          A null data segment was used to reference memory.
Page fault, #PF                              X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                         X          An unaligned memory reference was performed while alignment checking was enabled.
SWAPGS Swap GS Register with KernelGSbase MSR

Provides a fast method for system software to load a pointer to system data structures. SWAPGS can 
be used upon entering system-software routines as a result of a SYSCALL instruction, an interrupt or 
an exception. Prior to returning to application software, SWAPGS can be used to restore the 
application data pointer that was replaced by the system data-structure pointer. 

This instruction can only be executed in 64-bit mode. Executing SWAPGS in any other mode 
generates an undefined opcode exception. 

The SWAPGS instruction only exchanges the base-address value located in the KernelGSbase
model-specific register (MSR address C000_0102h) with the base-address value located in the
hidden portion of the GS selector register (GS.base). This allows the system-kernel software to access kernel
data structures by using the GS segment-override prefix during memory references.

The address stored in the KernelGSbase MSR must be in canonical form. The WRMSR instruction
used to load the KernelGSbase MSR causes a general-protection exception if the address loaded is not
in canonical form. The SWAPGS instruction itself does not perform a canonical check.

This instruction is only valid in 64-bit mode at CPL 0. A general protection exception (#GP) is 
generated if this instruction is executed at any other privilege level. 
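
The exchange and the two checks described above can be sketched as follows. This is an illustrative model only: the Cpu class, its field names, and the exception classes are hypothetical, and the order in which the mode check and the CPL check are made is an assumption of the sketch.

```python
from dataclasses import dataclass


class InvalidOpcode(Exception):
    """Stands in for #UD in this model."""


class GeneralProtection(Exception):
    """Stands in for #GP in this model."""


@dataclass
class Cpu:
    gs_base: int                 # hidden GS.base
    kernel_gs_base: int          # KernelGSbase MSR (C000_0102h)
    cpl: int = 0
    in_64bit_mode: bool = True


def swapgs(cpu: Cpu) -> None:
    """Exchange GS.base with the KernelGSbase MSR."""
    if not cpu.in_64bit_mode:
        raise InvalidOpcode("SWAPGS is only valid in 64-bit mode")
    if cpu.cpl != 0:
        raise GeneralProtection("SWAPGS requires CPL 0")
    # Only the base addresses are swapped; no canonical check is performed here.
    cpu.gs_base, cpu.kernel_gs_base = cpu.kernel_gs_base, cpu.gs_base
```

Executing swapgs twice restores the original pair of values, which matches the entry and exit pairing described in the Examples section.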

For additional information about this instruction, refer to “System-Management Instructions” in 
Volume 2. 

Examples 

At a kernel entry point, the OS uses SwapGS to obtain a pointer to kernel data structures and 
simultaneously save the user's GS base. Upon exit, it uses SwapGS to restore the user's GS base: 

SystemCallEntryPoint:
    SwapGS                          ; get kernel pointer, save user GSbase
    mov gs:[SavedUserRSP], rsp      ; save user's stack pointer
    mov rsp, gs:[KernelStackPtr]    ; set up kernel stack
    push rax                        ; now save user GPRs on kernel stack
    .                               ; perform system service
    .
    SwapGS                          ; restore user GS, save kernel pointer


Mnemonic    Opcode      Description

SWAPGS      0F 01 F8    Exchange GS base with KernelGSbase MSR. (Invalid in legacy and compatibility modes.)

Related Instructions

None


rFLAGS Affected 

None 

Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          This instruction was executed in legacy or compatibility mode.
General protection, #GP                      X          CPL was not 0.
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SYSCALL Fast System Call 

Transfers control to a fixed entry point in an operating system. It is designed for use by system and 
application software implementing a flat-segment memory model. 

The SYSCALL and SYSRET instructions are low-latency system call and return control-transfer 
instructions, which assume that the operating system implements a flat-segment memory model. By 
eliminating unneeded checks, and by loading pre-determined values into the CS and SS segment 
registers (both visible and hidden portions), calls to and returns from the operating system are greatly 
simplified. These instructions can be used in protected mode and are particularly well-suited for use in 
64-bit mode, which requires implementation of a paged, flat-segment memory model. 

This instruction has been optimized by reducing the number of checks and memory references that are
normally made so that a call or return takes considerably fewer clock cycles than the CALL FAR/RET
FAR instruction method.

It is assumed that the base, limit, and attributes of the Code Segment will remain flat for all processes 
and for the operating system, and that only the current privilege level for the selector of the calling 
process should be changed from a current privilege level of 3 to a new privilege level of 0. It is also 
assumed (but not checked) that the RPL of the SYSCALL and SYSRET target selectors are set to 0 
and 3, respectively. 

SYSCALL sets the CPL to 0, regardless of the values of bits 33:32 of the STAR register. There are no 
permission checks based on the CPL, real mode, or virtual-8086 mode. SYSCALL and SYSRET must 
be enabled by setting EFER.SCE to 1. 

It is the responsibility of the operating system to keep the descriptors in memory that correspond to the 
CS and SS selectors loaded by the SYSCALL and SYSRET instructions consistent with the segment 
base, limit, and attribute values forced by these instructions. 

Legacy x86 Mode. In legacy x86 mode, when SYSCALL is executed, the EIP of the instruction 
following the SYSCALL is copied into the ECX register. Bits 31:0 of the SYSCALL/SYSRET target 
address register (STAR) are copied into the EIP register. (The STAR register is model-specific register 
C000_0081h.) 

New selectors are loaded, without permission checking (see above), as follows: 

• Bits 47:32 of the STAR register specify the selector that is copied into the CS register. 

• Bits 47:32 of the STAR register + 8 specify the selector that is copied into the SS register. 

• The CS base and the SS base are both forced to zero. 

• The CS limit and the SS limit are both forced to 4 Gbytes.

• The CS segment attributes are set to execute/read 32-bit code with a CPL of zero. 

• The SS segment attributes are set to read/write and expand-up with a 32-bit stack referenced by 
ESP. 
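
As a rough sketch of the selector rules in the list above, the following Python helpers extract the fields from a STAR value. The function names are hypothetical. Clearing the low two bits of the CS selector follows the "AND 0xFFFC" step in the Action pseudocode of this entry; wrapping the SS selector to 16 bits is an assumption of the sketch.

```python
from typing import Tuple


def syscall_selectors(star: int) -> Tuple[int, int]:
    """Return (CS, SS) selectors that SYSCALL loads from a 64-bit STAR value."""
    syscall_cs = (star >> 32) & 0xFFFF        # STAR bits 47:32
    cs = syscall_cs & 0xFFFC                  # RPL bits cleared, as in the Action pseudocode
    ss = (syscall_cs + 8) & 0xFFFF            # next descriptor; 16-bit wrap is assumed
    return cs, ss


def syscall_legacy_target_eip(star: int) -> int:
    """Return the legacy-mode target EIP held in STAR bits 31:0."""
    return star & 0xFFFF_FFFF
```

For a STAR value whose bits 47:32 hold 0010h, this yields CS = 0010h and SS = 0018h.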
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Long Mode. When long mode is activated, the behavior of the SYSCALL instruction depends on 
whether the calling software is in 64-bit mode or compatibility mode. In 64-bit mode, SYSCALL 
saves the RIP of the instruction following the SYSCALL into RCX and loads the new RIP from 
LSTAR bits 63:0. (The LSTAR register is model-specific register C000_0082h.) In compatibility 
mode, SYSCALL saves the RIP of the instruction following the SYSCALL into RCX and loads the 
new RIP from CSTAR bits 63:0. (The CSTAR register is model-specific register C000_0083h.) 

New selectors are loaded, without permission checking (see above), as follows: 

• Bits 47:32 of the STAR register specify the selector that is copied into the CS register. 

• Bits 47:32 of the STAR register + 8 specify the selector that is copied into the SS register. 

• The CS base and the SS base are both forced to zero. 

• The CS limit and the SS limit are both forced to 4 Gbytes.

• The CS segment attributes are set to execute/read 64-bit code with a CPL of zero. 

• The SS segment attributes are set to read/write and expand-up with a 64-bit stack referenced by 
RSP. 

The WRMSR instruction loads the target RIP into the LSTAR and CSTAR registers. If an RIP written 
by WRMSR is not in canonical form, a general-protection exception (#GP) occurs. 

How SYSCALL and SYSRET handle rFLAGS depends on the processor’s operating mode.

In legacy mode, SYSCALL treats EFLAGS as follows: 

• EFLAGS.IF is cleared to 0. 

• EFLAGS.RF is cleared to 0. 

• EFLAGS.VM is cleared to 0. 

In long mode, SYSCALL treats RFLAGS as follows: 

• The current value of RFLAGS is saved in R11. 

• RFLAGS is masked using the value stored in SYSCALL_FLAG_MASK.

• RFLAGS.RF is cleared to 0. 
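
The long-mode flag handling listed above can be illustrated with plain integer arithmetic. This is a sketch, not a definitive model: the function name is hypothetical, the mask argument stands for the SYSCALL_FLAG_MASK MSR value, and saving R11 with RF already cleared follows the "with rf cleared" comment in the Action pseudocode of this entry.

```python
RF = 1 << 16  # rFLAGS.RF is bit 16


def syscall_long_mode_flags(rflags: int, sfmask: int):
    """Return (r11, new_rflags) after a long-mode SYSCALL."""
    r11 = rflags & ~RF             # saved copy, RF cleared per the Action comment
    new_rflags = rflags & ~sfmask  # every bit set in the mask is cleared
    new_rflags &= ~RF              # RF is always cleared
    return r11, new_rflags
```

A mask with bit 9 set, for example, causes IF to be cleared on entry so the system-call handler starts with interrupts disabled.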

For further details on the SYSCALL and SYSRET instructions and their associated MSR registers
(STAR, LSTAR, CSTAR, and SYSCALL_FLAG_MASK), see “Fast System Call and Return” in
Volume 2.

Support for the SYSCALL instruction is indicated by CPUID Fn8000_0001_EDX[SysCallSysRet] = 
1. For more information on using the CPUID instruction, see the description of the CPUID instruction 
on page 160. 
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Instruction Encoding 

Mnemonic    Opcode    Description

SYSCALL     0F 05     Call operating system.

Action 

// See "Pseudocode Definition" on page 57.

SYSCALL_START:

    IF (MSR_EFER.SCE == 0)       // Check if syscall/sysret are enabled.
        EXCEPTION [#UD]

    IF (LONG_MODE)
        SYSCALL_LONG_MODE
    ELSE // (LEGACY_MODE)
        SYSCALL_LEGACY_MODE

SYSCALL_LONG_MODE:

    RCX.q = next_RIP
    R11.q = RFLAGS               // with rf cleared

    IF (64BIT_MODE)
        temp_RIP.q = MSR_LSTAR
    ELSE // (COMPATIBILITY_MODE)
        temp_RIP.q = MSR_CSTAR

    CS.sel   = MSR_STAR.SYSCALL_CS AND 0xFFFC
    CS.attr  = 64-bit code,dpl0  // Always switch to 64-bit mode in long mode.
    CS.base  = 0x00000000
    CS.limit = 0xFFFFFFFF

    SS.sel   = MSR_STAR.SYSCALL_CS + 8
    SS.attr  = 64-bit stack,dpl0
    SS.base  = 0x00000000
    SS.limit = 0xFFFFFFFF

    RFLAGS = RFLAGS AND ~MSR_SFMASK
    RFLAGS.RF = 0

    CPL = 0

    RIP = temp_RIP
    EXIT

SYSCALL_LEGACY_MODE:

    RCX.d = next_RIP

    temp_RIP.d = MSR_STAR.EIP

    CS.sel   = MSR_STAR.SYSCALL_CS AND 0xFFFC
    CS.attr  = 32-bit code,dpl0  // Always switch to 32-bit mode in legacy mode.
    CS.base  = 0x00000000
    CS.limit = 0xFFFFFFFF

    SS.sel   = MSR_STAR.SYSCALL_CS + 8
    SS.attr  = 32-bit stack,dpl0
    SS.base  = 0x00000000
    SS.limit = 0xFFFFFFFF

    RFLAGS.VM,IF,RF=0

    CPL = 0

    RIP = temp_RIP
    EXIT

Related Instructions 

SYSRET, SYSENTER, SYSEXIT 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
M    M    M    M    0    0    M    M      M    M    M    M    M    M    M    M    M
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags
are blank. Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          The SYSCALL and SYSRET instructions are not supported, as indicated by CPUID Fn8000_0001_EDX[SysCallSysRet] = 0.
                         X     X             X          The system call extension bit (SCE) of the extended feature enable register (EFER) is set to 0. (The EFER register is MSR C000_0080h.)
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SYSENTER System Call 

Transfers control to a fixed entry point in an operating system. It is designed for use by system and 
application software implementing a flat-segment memory model. This instruction is valid only in 
legacy mode. 

Three model-specific registers (MSRs) are used to specify the target address and stack pointers for the 
SYSENTER instruction, as well as the CS and SS selectors of the called and returned procedures: 

• MSR_SYSENTER_CS: Contains the CS selector of the called procedure. The SS selector is set to
MSR_SYSENTER_CS + 8.

• MSR_SYSENTER_ESP: Contains the called procedure’s stack pointer.

• MSR_SYSENTER_EIP: Contains the offset into the CS of the called procedure.

The hidden portions of the CS and SS segment registers are not loaded from the descriptor table as 
they would be using a legacy x86 CALL instruction. Instead, the hidden portions are forced by the 
processor to the following values: 

• The CS and SS base values are forced to 0. 

• The CS and SS limit values are forced to 4 Gbytes. 

• The CS segment attributes are set to execute/read 32-bit code with a CPL of zero. 

• The SS segment attributes are set to read/write and expand-up with a 32-bit stack referenced by 
ESP. 

System software must create corresponding descriptor-table entries referenced by the new CS and SS 
selectors that match the values described above. 
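
A minimal sketch of the register state that results from SYSENTER, under stated assumptions: the function and exception names are hypothetical, the three MSR values are passed in as plain integers, clearing VM (bit 17) and IF (bit 9) follows the rFLAGS Affected table for this instruction, and the null-selector test (index 0 in the GDT) follows the Exceptions table. Any adjustment of the selector RPL bits and the real-mode #GP are not modeled.

```python
VM = 1 << 17  # EFLAGS.VM
IF = 1 << 9   # EFLAGS.IF


class GeneralProtection(Exception):
    """Stands in for #GP in this model."""


def sysenter(msr_cs: int, msr_esp: int, msr_eip: int, eflags: int) -> dict:
    """Return the CS/SS/EIP/ESP/EFLAGS/CPL values after SYSENTER."""
    if (msr_cs & 0xFFFC) == 0:
        raise GeneralProtection("MSR_SYSENTER_CS is a null selector")
    return {
        "cs": msr_cs,                   # selector of the called procedure
        "ss": msr_cs + 8,               # next descriptor after CS
        "eip": msr_eip,                 # offset into the new CS
        "esp": msr_esp,                 # called procedure's stack pointer
        "eflags": eflags & ~(VM | IF),  # VM and IF are cleared
        "cpl": 0,
    }
```

Because the return EIP and the application stack pointer do not appear anywhere in the result, the caller has to save them itself before executing SYSENTER.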

The return EIP and application stack are not saved by this instruction. System software must explicitly
save that information.

An invalid-opcode exception occurs if this instruction is used in long mode. Software should use the 
SYSCALL (and SYSRET) instructions in long mode. If SYSENTER is used in real mode, a #GP is 
raised. 

For additional information on this instruction, see “SYSENTER and SYSEXIT (Legacy Mode Only)”
in Volume 2.

Support for the SYSENTER instruction is indicated by CPUID Fn0000_0001_EDX[SysEnterSysExit] 
= 1. For more information on using the CPUID instruction, see the description of the CPUID 
instruction on page 160. 

Instruction Encoding 

Mnemonic    Opcode    Description

SYSENTER    0F 34     Call operating system.
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Related Instructions 

SYSCALL, SYSEXIT, SYSRET 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                    0                               0
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags are blank.
Undefined flags are U.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          The SYSENTER and SYSEXIT instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[SysEnterSysExit] = 0.
                                             X          This instruction is not recognized in long mode.
General protection, #GP  X                              This instruction is not recognized in real mode.
                               X             X          MSR_SYSENTER_CS was a null selector.
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SYSEXIT System Return 

Returns from the operating system to an application. It is a low-latency system return instruction 
designed for use by system and application software implementing a flat-segment memory model. 

This is a privileged instruction. The current privilege level must be zero to execute this instruction. An 
invalid-opcode exception occurs if this instruction is used in long mode. Software should use the 
SYSRET (and SYSCALL) instructions when running in long mode. 

When a system procedure performs a SYSEXIT back to application software, the CS selector is
updated to point to the second descriptor entry after the SYSENTER_CS value
(MSR_SYSENTER_CS + 16). The SS selector is updated to point to the third descriptor entry after the
SYSENTER_CS value (MSR_SYSENTER_CS + 24). The CPL is forced to 3, as are the descriptor
privilege levels.

The hidden portions of the CS and SS segment registers are not loaded from the descriptor table as 
they would be using a legacy x86 RET instruction. Instead, the hidden portions are forced by the 
processor to the following values: 

• The CS and SS base values are forced to 0. 

• The CS and SS limit values are forced to 4 Gbytes. 

• The CS segment attributes are set to 32-bit read/execute at CPL 3. 

• The SS segment attributes are set to read/write and expand-up with a 32-bit stack referenced by 
ESP. 

System software must create corresponding descriptor-table entries referenced by the new CS and SS 
selectors that match the values described above. 

The following additional actions result from executing SYSEXIT: 

• EIP is loaded from EDX. 

• ESP is loaded from ECX. 

System software must explicitly load the return address and application software-stack pointer into the 
EDX and ECX registers prior to executing SYSEXIT. 
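
The selector arithmetic and register loads described above can be sketched as follows. The function and exception names are hypothetical, the null-selector test follows the Exceptions table for this instruction, and any adjustment of the selector RPL bits is not modeled.

```python
class GeneralProtection(Exception):
    """Stands in for #GP in this model."""


def sysexit(msr_sysenter_cs: int, edx: int, ecx: int) -> dict:
    """Return the CS/SS/EIP/ESP/CPL values after SYSEXIT."""
    if (msr_sysenter_cs & 0xFFFC) == 0:
        raise GeneralProtection("MSR_SYSENTER_CS is a null selector")
    return {
        "cs": msr_sysenter_cs + 16,  # second descriptor after SYSENTER_CS
        "ss": msr_sysenter_cs + 24,  # third descriptor after SYSENTER_CS
        "eip": edx,                  # return address supplied in EDX
        "esp": ecx,                  # application stack pointer supplied in ECX
        "cpl": 3,
    }
```

With MSR_SYSENTER_CS = 0008h, the application code and stack selectors come out as 0018h and 0020h, which is why the four descriptors must be laid out consecutively in the descriptor table.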

For additional information on this instruction, see “SYSENTER and SYSEXIT (Legacy Mode Only)”
in Volume 2.

Support for the SYSEXIT instruction is indicated by CPUID Fn0000_0001_EDX[SysEnterSysExit] = 
1. For more information on using the CPUID instruction, see the description of the CPUID instruction 
on page 160. 
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Instruction Encoding 


Mnemonic    Opcode    Description

SYSEXIT     0F 35     Return from operating system to application.


Related Instructions 

SYSCALL, SYSENTER, SYSRET 

rFLAGS Affected 


ID   VIP  VIF  AC   VM   RF   NT   IOPL   OF   DF   IF   TF   SF   ZF   AF   PF   CF
                         0
21   20   19   18   17   16   14   13:12  11   10   9    8    7    6    4    2    0

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags are
blank.


Exceptions 


Exception                Real  Virtual 8086  Protected  Cause of Exception
Invalid opcode, #UD      X     X             X          The SYSENTER and SYSEXIT instructions are not supported, as indicated by CPUID Fn0000_0001_EDX[SysEnterSysExit] = 0.
                                             X          This instruction is not recognized in long mode.
General protection, #GP  X     X                        This instruction is only recognized in protected mode.
                                             X          CPL was not 0.
                                             X          MSR_SYSENTER_CS was a null selector.
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SYSRET Fast System Return 

Returns from the operating system to an application. It is a low-latency system return instruction 
designed for use by system and application software implementing a flat segmentation memory model. 

The SYSCALL and SYSRET instructions are low-latency system call and return control-transfer 
instructions that assume that the operating system implements a flat-segment memory model. By 
eliminating unneeded checks, and by loading pre-determined values into the CS and SS segment 
registers (both visible and hidden portions), calls to and returns from the operating system are greatly 
simplified. These instructions can be used in protected mode and are particularly well-suited for use in 
64-bit mode, which requires implementation of a paged, flat-segment memory model. 

This instruction has been optimized by reducing the number of checks and memory references that are 
normally made so that a call or return takes substantially fewer internal clock cycles when compared to 
the CALL/RET instruction method. 

It is assumed that the base, limit, and attributes of the Code Segment will remain flat for all processes 
and for the operating system, and that only the current privilege level for the selector of the calling 
process should be changed from a current privilege level of 0 to a new privilege level of 3. It is also 
assumed (but not checked) that the RPL of the SYSCALL and SYSRET target selectors are set to 0 
and 3, respectively. 

SYSRET sets the CPL to 3, regardless of the values of bits 49:48 of the STAR register. SYSRET can only
be executed in protected mode at CPL 0. SYSCALL and SYSRET must be enabled by setting
EFER.SCE to 1.

It is the responsibility of the operating system to keep the descriptors in memory that correspond to the 
CS and SS selectors loaded by the SYSCALL and SYSRET instructions consistent with the segment 
base, limit, and attribute values forced by these instructions. 

When a system procedure performs a SYSRET back to application software, the CS selector is
updated from bits 63:50 of the STAR register (STAR.SYSRET_CS) as follows:

• If the return is to 32-bit mode (legacy or compatibility), CS is updated with the value of
STAR.SYSRET_CS.

• If the return is to 64-bit mode, CS is updated with the value of STAR.SYSRET_CS + 16.

In both cases, the CPL is forced to 3, effectively ignoring STAR bits 49:48. The SS selector is updated
to point to the next descriptor-table entry after the CS descriptor (STAR.SYSRET_CS + 8), and its
RPL is not forced to 3.

The hidden portions of the CS and SS segment registers are not loaded from the descriptor table as 
they would be using a legacy x86 RET instruction. Instead, the hidden portions are forced by the 
processor to the following values: 

• The CS base value is forced to 0. 

• The CS limit value is forced to 4 Gbytes. 


• The CS segment attributes are set to execute-read 32 bits or 64 bits (see below). 

• The SS segment base, limit, and attributes are not modified. 

When SYSCALLed system software is running in 64-bit mode, it has been entered from either 64-bit 
mode or compatibility mode. The corresponding SYSRET needs to know the mode to which it must 
return. Executing SYSRET in non-64-bit mode or with a 16- or 32-bit operand size returns to 32-bit 
mode with a 32-bit stack pointer. Executing SYSRET in 64-bit mode with a 64-bit operand size returns 
to 64-bit mode with a 64-bit stack pointer. 

The instruction pointer is updated with the return address based on the operating mode in which 
SYSRET is executed: 

• If returning to 64-bit mode, SYSRET loads RIP with the value of RCX. 

• If returning to 32-bit mode, SYSRET loads EIP with the value of ECX. 
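
The selector and instruction-pointer rules for SYSRET can be sketched as a single function. This is an illustrative model with hypothetical names; the "OR 3" on the CS selector follows the Action pseudocode of this entry, and wrapping the selectors to 16 bits is an assumption of the sketch.

```python
def sysret_target(star: int, rcx: int, in_64bit_mode: bool, operand_size: int):
    """Return (CS, SS, instruction pointer) loaded by SYSRET."""
    sysret_cs = (star >> 48) & 0xFFFF                # STAR.SYSRET_CS field
    if in_64bit_mode and operand_size == 64:
        cs = ((sysret_cs + 16) | 3) & 0xFFFF         # return to 64-bit mode
        ip = rcx & 0xFFFF_FFFF_FFFF_FFFF             # RIP = RCX
    else:
        cs = sysret_cs | 3                           # return to 32-bit mode
        ip = rcx & 0xFFFF_FFFF                       # EIP = ECX
    ss = (sysret_cs + 8) & 0xFFFF                    # RPL of SS is not forced to 3
    return cs, ss, ip
```

For example, a SYSRET_CS field of 0023h gives CS = 0033h and SS = 002Bh on a 64-bit return, and CS = 0023h on a 32-bit return.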

How SYSRET handles RFLAGS depends on the processor’s operating mode: 

• If executed in 64-bit mode, SYSRET loads the lower 32 RFLAGS bits from R11[31:0] and clears
the upper 32 RFLAGS bits.

• If executed in legacy mode or compatibility mode, SYSRET sets EFLAGS.IF. 

For further details on the SYSCALL and SYSRET instructions and their associated MSR registers 
(STAR, LSTAR, and CSTAR), see “Fast System Call and Return” in Volume 2. 

Support for the SYSRET instruction is indicated by CPUID Fn8000_0001_EDX[SysCallSysRet] = 1.
For more information on using the CPUID instruction, see the description of the CPUID instruction on
page 160.

Instruction Encoding

Mnemonic    Opcode    Description

SYSRET      0F 07     Return from operating system.

Action 

// See "Pseudocode Definition" on page 57.

SYSRET_START:

    IF (MSR_EFER.SCE == 0)       // Check if syscall/sysret are enabled.
        EXCEPTION [#UD]

    IF ((!PROTECTED_MODE) || (CPL != 0))
        EXCEPTION [#GP(0)]       // SYSRET requires protected mode, cpl0

    IF (64BIT_MODE)
        SYSRET_64BIT_MODE
    ELSE // (!64BIT_MODE)
        SYSRET_NON_64BIT_MODE

SYSRET_64BIT_MODE:

    IF (OPERAND_SIZE == 64)      // Return to 64-bit mode.
    {
        CS.sel   = (MSR_STAR.SYSRET_CS + 16) OR 3
        CS.base  = 0x00000000
        CS.limit = 0xFFFFFFFF
        CS.attr  = 64-bit code,dpl3

        temp_RIP.q = RCX
    }
    ELSE                         // Return to 32-bit compatibility mode.
    {
        CS.sel   = MSR_STAR.SYSRET_CS OR 3
        CS.base  = 0x00000000
        CS.limit = 0xFFFFFFFF
        CS.attr  = 32-bit code,dpl3

        temp_RIP.d = RCX
    }

    SS.sel = MSR_STAR.SYSRET_CS + 8    // SS selector is changed,
                                       // SS base, limit, attributes unchanged.

    RFLAGS.q = R11               // RF=0,VM=0
    CPL = 3

    RIP = temp_RIP
    EXIT

SYSRET_NON_64BIT_MODE:

    CS.sel   = MSR_STAR.SYSRET_CS OR 3 // Return to 32-bit legacy protected mode.
    CS.base  = 0x00000000
    CS.limit = 0xFFFFFFFF
    CS.attr  = 32-bit code,dpl3

    temp_RIP.d = RCX

    SS.sel = MSR_STAR.SYSRET_CS + 8    // SS selector is changed.
                                       // SS base, limit, attributes unchanged.

    RFLAGS.IF = 1
    CPL = 3

    RIP = temp_RIP
    EXIT
Related Instructions 

SYSCALL, SYSENTER, SYSEXIT 

rFLAGS Affected 

| ID | VIP | VIF | AC | VM | RF | NT | IOPL | OF | DF | IF | TF | SF | ZF | AF | PF | CF |
|----|-----|-----|----|----|----|----|------|----|----|----|----|----|----|----|----|----|
| M | M | M | M | | 0 | M | M | M | M | M | M | M | M | M | M | M |
| 21 | 20 | 19 | 18 | 17 | 16 | 14 | 13:12 | 11 | 10 | 9 | 8 | 7 | 6 | 4 | 2 | 0 |

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags 
are blank. Undefined flags are U. 


Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The SYSCALL and SYSRET instructions are not supported, as indicated by CPUID Fn8000_0001_EDX[SysCallSysRet] = 0. |
| | X | X | X | The system call extension bit (SCE) of the extended feature enable register (EFER) is set to 0. (The EFER register is MSR C000_0080h.) |
| General protection, #GP | X | X | | This instruction is only recognized in protected mode. |
| | | | X | CPL was not 0. |
VERR Verify Segment for Reads 

Verifies whether a code or data segment specified by the segment selector in the 16-bit register or 
memory operand is readable from the current privilege level. The zero flag (ZF) is set to 1 if the 
specified segment is readable. Otherwise, ZF is cleared. 

A segment is readable if all of the following apply: 

• the selector is not a null selector. 

• the descriptor is within the GDT or LDT limit. 

• the segment is a data segment or readable code segment. 

• the descriptor DPL is greater than or equal to both the CPL and RPL, or the segment is a 
conforming code segment. 

The processor does not recognize the VERR instruction in real or virtual-8086 mode. 
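The readability conditions listed above can be sketched as a small Python predicate. This is a simplified model, not a description of processor internals: the descriptor lookup is left to the caller, the `Descriptor` class and its field names are hypothetical, and the limit check assumes 8-byte descriptors located at the offset given by the selector with its low three bits cleared.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Hypothetical, already-decoded view of a segment descriptor."""
    is_code_or_data: bool  # False for system descriptors
    is_code: bool          # True for code segments, False for data segments
    readable: bool         # meaningful only for code segments
    conforming: bool       # meaningful only for code segments
    dpl: int


def verr_sets_zf(selector: int, table_limit: int, desc: Descriptor, cpl: int) -> bool:
    """Return True if VERR would set ZF under the conditions listed above."""
    rpl = selector & 0x3
    if (selector & 0xFFFC) == 0:               # null selector
        return False
    if (selector & 0xFFF8) + 7 > table_limit:  # descriptor outside GDT/LDT limit
        return False
    if not desc.is_code_or_data:               # neither a data nor a code segment
        return False
    if desc.is_code and not desc.readable:     # execute-only code segment
        return False
    if desc.is_code and desc.conforming:       # conforming code skips the DPL check
        return True
    return desc.dpl >= cpl and desc.dpl >= rpl
```

A caller would pass the limit of the GDT or LDT selected by the TI bit of the selector, together with the descriptor read from that table.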

Mnemonic          Opcode      Description 

VERR reg/mem16    0F 00 /4    Set the zero flag (ZF) to 1 if the segment selected can be read. 


Related Instructions 

ARPL, LAR, LSL, VERW 

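rFLAGS Affected 

| ID | VIP | VIF | AC | VM | RF | NT | IOPL | OF | DF | IF | TF | SF | ZF | AF | PF | CF |
|----|-----|-----|----|----|----|----|------|----|----|----|----|----|----|----|----|----|
| | | | | | | | | | | | | | M | | | |
| 21 | 20 | 19 | 18 | 17 | 16 | 14 | 13:12 | 11 | 10 | 9 | 8 | 7 | 6 | 4 | 2 | 0 |

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags are 
blank. Undefined flags are U. 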


Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | | This instruction is only recognized in protected mode. |
| Stack, #SS | | | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | | | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | | X | An unaligned memory reference was performed while alignment checking was enabled. |
VERW Verify Segment for Write 

Verifies whether a data segment specified by the segment selector in the 16-bit register or memory 
operand is writable from the current privilege level. The zero flag (ZF) is set to 1 if the specified 
segment is writable. Otherwise, ZF is cleared. 

A segment is writable if all of the following apply: 

• the selector is not a null selector. 

• the descriptor is within the GDT or LDT limit. 

• the segment is a writable data segment. 

• the descriptor DPL is greater than or equal to both the CPL and RPL. 

The processor does not recognize the VERW instruction in real or virtual-8086 mode. 

Mnemonic          Opcode      Description 

VERW reg/mem16    0F 00 /5    Set the zero flag (ZF) to 1 if the segment selected can be written. 

Related Instructions 

ARPL, LAR, LSL, VERR 

rFLAGS Affected 

| ID | VIP | VIF | AC | VM | RF | NT | IOPL | OF | DF | IF | TF | SF | ZF | AF | PF | CF |
|----|-----|-----|----|----|----|----|------|----|----|----|----|----|----|----|----|----|
| | | | | | | | | | | | | | M | | | |
| 21 | 20 | 19 | 18 | 17 | 16 | 14 | 13:12 | 11 | 10 | 9 | 8 | 7 | 6 | 4 | 2 | 0 |

Note: Bits 31:22, 15, 5, 3, and 1 are reserved. A flag set to one or cleared to zero is M (modified). Unaffected flags are 
blank. Undefined flags are U. 

Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | | This instruction is only recognized in protected mode. |
| Stack, #SS | | | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | | | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to access memory. |
| Page fault, #PF | | | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | | X | An unaligned memory reference was performed while alignment checking was enabled. |
VMLOAD Load State from VMCB 

Loads a subset of processor state from the VMCB specified by the system-physical address in the rAX 
register. The portion of RAX used to form the address is determined by the effective address size. 

The VMSAVE and VMLOAD instructions complement the state save/restore abilities of VMRUN and 
#VMEXIT, providing access to hidden state that software is otherwise unable to access, plus some 
additional commonly-used state. 

This is a Secure Virtual Machine (SVM) instruction. Support for the SVM architecture and the SVM 
instructions is indicated by CPUID Fn8000_0001_ECX[SVM] = 1. For more information on using the 
CPUID instruction, see the reference page for the CPUID instruction on page 160. 

This instruction generates a #UD exception if SVM is not enabled. See “Enabling SVM” in AMD64 
Architecture Programmer's Manual Volume 2: System Programming, order# 24593. 

Mnemonic      Opcode      Description 

VMLOAD rAX    0F 01 DA    Load additional state from VMCB. 

Action 

IF ((MSR_EFER.SVME == 0) || (!PROTECTED_MODE)) 
    EXCEPTION [#UD]    // This instruction can only be executed in protected 
                       // mode with SVM enabled 

IF (CPL != 0)          // This instruction is only allowed at CPL 0 
    EXCEPTION [#GP] 

IF (rAX contains an unsupported system-physical address) 
    EXCEPTION [#GP] 

Load from a VMCB at system-physical address rAX: 
    FS, GS, TR, LDTR (including all hidden state) 
    KernelGsBase 
    STAR, LSTAR, CSTAR, SFMASK 
    SYSENTER_CS, SYSENTER_ESP, SYSENTER_EIP 
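The order of the checks in the Action above can be sketched in Python. This is an illustration under stated assumptions: the function and parameter names are hypothetical, "unsupported system-physical address" is modeled using only the two causes listed in the Exceptions table for this instruction (misalignment and exceeding the maximum physical address), and the number of supported physical address bits is supplied by the caller.

```python
def vmcb_precheck(svme: bool, protected_mode: bool, cpl: int, rax: int,
                  address_size_bits: int, phys_addr_bits: int):
    """Return '#UD', '#GP', or None when all checks pass.

    address_size_bits -- effective address size (16, 32, or 64); selects the
                         portion of RAX used to form the VMCB address
    phys_addr_bits    -- supported physical address width (caller-supplied)
    """
    if not svme or not protected_mode:
        return "#UD"
    if cpl != 0:
        return "#GP"
    addr = rax & ((1 << address_size_bits) - 1)
    if addr & 0xFFF:            # not aligned on a 4-Kbyte boundary
        return "#GP"
    if addr >> phys_addr_bits:  # above the maximum supported physical address
        return "#GP"
    return None
```

The same sequence of checks appears in the Action sections of VMRUN and VMSAVE.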

Related Instructions 

VMSAVE 

rFLAGS Affected 

None. 
Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The SVM instructions are not supported as indicated by CPUID Fn8000_0001_ECX[SVM] = 0. |
| | | | X | Secure Virtual Machine was not enabled (EFER.SVME=0). |
| | X | X | | The instruction is only recognized in protected mode. |
| General protection, #GP | | | X | CPL was not zero. |
| | | | X | rAX referenced a physical address above the maximum supported physical address. |
| | | | X | The address in rAX was not aligned on a 4-Kbyte boundary. |
VMMCALL Call VMM 

Provides a mechanism for a guest to explicitly communicate with the VMM by generating a 
#VMEXIT. 

A non-intercepted VMMCALL unconditionally raises a #UD exception. 

VMMCALL is not restricted to either protected mode or CPL zero. 

This is a Secure Virtual Machine (SVM) instruction. Support for the SVM architecture and the SVM 
instructions is indicated by CPUID Fn8000_0001_ECX[SVM] = 1. For more information on using the 
CPUID instruction, see the reference page for the CPUID instruction on page 160. 

This instruction generates a #UD exception if SVM is not enabled. See “Enabling SVM” in AMD64 
Architecture Programmer's Manual Volume 2: System Programming, order# 24593. 

Mnemonic    Opcode      Description 

VMMCALL     0F 01 D9    Explicit communication with the VMM. 

Related Instructions 

None. 

rFLAGS Affected 

None. 


Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The SVM instructions are not supported as indicated by CPUID Fn8000_0001_ECX[SVM] = 0. |
| | X | X | X | Secure Virtual Machine was not enabled (EFER.SVME=0). |
| | X | X | X | VMMCALL was not intercepted. |
VMRUN Run Virtual Machine 

Starts execution of a guest instruction stream. The physical address of the virtual machine control 
block (VMCB) describing the guest is taken from the rAX register (the portion of RAX used to form 
the address is determined by the effective address size). The physical address of the VMCB must be 
aligned on a 4K-byte boundary. 

VMRUN saves a subset of host processor state to the host state-save area specified by the physical 
address in the VM_HSAVE_PA MSR. VMRUN then loads guest processor state (and control 
information) from the VMCB at the physical address specified in rAX. The processor then executes 
guest instructions until one of several intercept events (specified in the VMCB) is triggered. When an 
intercept event occurs, the processor stores a snapshot of the guest state back into the VMCB, reloads 
the host state, and continues execution of host code at the instruction following the VMRUN 
instruction. 

This is a Secure Virtual Machine (SVM) instruction. Support for the SVM architecture and the SVM 
instructions is indicated by CPUID Fn8000_0001_ECX[SVM] = 1. For more information on using the 
CPUID instruction, see the reference page for the CPUID instruction on page 160. 

This instruction generates a #UD exception if SVM is not enabled. See “Enabling SVM” in AMD64 
Architecture Programmer's Manual Volume 2: System Programming, order# 24593. 

The VMRUN instruction is not supported in System Management Mode. Processor behavior resulting 
from an attempt to execute this instruction from within the SMM handler is undefined. 

Instruction Encoding 

Mnemonic     Opcode      Description 

VMRUN rAX    0F 01 D8    Performs a world-switch to guest. 

Action 

IF ((MSR_EFER.SVME == 0) || (!PROTECTED_MODE)) 
    EXCEPTION [#UD]    // This instruction can only be executed in protected 
                       // mode with SVM enabled 

IF (CPL != 0)          // This instruction is only allowed at CPL 0 
    EXCEPTION [#GP] 

IF (rAX contains an unsupported physical address) 
    EXCEPTION [#GP] 

IF (intercepted(VMRUN)) 
    #VMEXIT (VMRUN) 

remember VMCB address (delivered in rAX) for next #VMEXIT 

save host state to physical memory indicated in the VM_HSAVE_PA MSR: 
    ES.sel 
    CS.sel 
    SS.sel 
    DS.sel 
    GDTR.{base,limit} 
    IDTR.{base,limit} 
    EFER 
    CR0 
    CR4 
    CR3 
    // host CR2 is not saved 
    RFLAGS 
    RIP 
    RSP 
    RAX 

from the VMCB at physical address rAX, load control information: 
    intercept vector 
    TSC_OFFSET 
    interrupt control (v_irq, v_intr_*, v_tpr) 
    EVENTINJ field 
    ASID 
    IF (nested paging supported) 
        NP_ENABLE 
        IF (NP_ENABLE == 1) 
            nCR3 

from the VMCB at physical address rAX, load guest state: 
    ES.{base,limit,attr,sel} 
    CS.{base,limit,attr,sel} 
    SS.{base,limit,attr,sel} 
    DS.{base,limit,attr,sel} 
    GDTR.{base,limit} 
    IDTR.{base,limit} 
    EFER 
    CR0 
    CR4 
    CR3 
    CR2 
    IF (NP_ENABLE == 1) 
        gPAT           // Leaves host hPAT register unchanged. 
    RFLAGS 
    RIP 
    RSP 
    RAX 
    DR7 
    DR6 
    CPL                // 0 for real mode, 3 for v86 mode, else as loaded. 
    INTERRUPT_SHADOW 

IF (LBR virtualization supported) 
    LBR_VIRTUALIZATION_ENABLE 
    IF (LBR_VIRTUALIZATION_ENABLE == 1) 
        save LBR state to the host save area 
            DBGCTL 
            BR_FROM 
            BR_TO 
            LASTEXCP_FROM 
            LASTEXCP_TO 
        load LBR state from the VMCB 
            DBGCTL 
            BR_FROM 
            BR_TO 
            LASTEXCP_FROM 
            LASTEXCP_TO 

IF (guest state consistency checks fail) 
    #VMEXIT(INVALID) 

Execute command stored in TLB_CONTROL. 

GIF = 1                // allow interrupts in the guest 

IF (EVENTINJ.V) 
    cause exception/interrupt in guest 
else 
    jump to first guest instruction 

Upon #VMEXIT, the processor performs the following actions in order to return to the host execution 
context: 

GIF = 0 

save guest state to VMCB: 
    ES.{base,limit,attr,sel} 
    CS.{base,limit,attr,sel} 
    SS.{base,limit,attr,sel} 
    DS.{base,limit,attr,sel} 
    GDTR.{base,limit} 
    IDTR.{base,limit} 
    EFER 
    CR4 
    CR3 
    CR2 
    CR0 
    if (nested paging enabled) 
        gPAT 
    RFLAGS 
    RIP 
    RSP 
    RAX 
    DR7 
    DR6 
    CPL 
    INTERRUPT_SHADOW 

save additional state and intercept information: 
    V_IRQ, V_TPR 
    EXITCODE 
    EXITINFO1 
    EXITINFO2 
    EXITINTINFO 

clear EVENTINJ field in VMCB 

prepare for host mode by clearing internal processor state bits: 
    clear intercepts 
    clear v_irq 
    clear v_intr_masking 
    clear tsc_offset 
    disable nested paging 
    clear ASID to zero 

reload host state 
    GDTR.{base,limit} 
    IDTR.{base,limit} 
    EFER 
    CR0 
    CR0.PE = 1         // saved copy of CR0.PE is ignored 
    CR4 
    CR3 
    if (host is in PAE paging mode) 
        reloaded host PDPEs 
    // Do not reload host CR2 or PAT 
    RFLAGS 
    RIP 
    RSP 
    RAX 
    DR7 = "all disabled" 
    CPL = 0 
    ES.sel; reload segment descriptor from GDT 
    CS.sel; reload segment descriptor from GDT 
    SS.sel; reload segment descriptor from GDT 
    DS.sel; reload segment descriptor from GDT 

if (LBR virtualization supported) 
    LBR_VIRTUALIZATION_ENABLE 
    if (LBR_VIRTUALIZATION_ENABLE == 1) 
        save LBR state to the VMCB: 
            DBGCTL 
            BR_FROM 
            BR_TO 
            LASTEXCP_FROM 
            LASTEXCP_TO 
        load LBR state from the host save area: 
            DBGCTL 
            BR_FROM 
            BR_TO 
            LASTEXCP_FROM 
            LASTEXCP_TO 

if (illegal host state loaded, or exception while loading host state) 
    shutdown 
else 
    execute first host instruction following the VMRUN 

Related Instructions 

VMLOAD, VMSAVE. 

rFLAGS Affected 

None. 


Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The SVM instructions are not supported as indicated by CPUID Fn8000_0001_ECX[SVM] = 0. |
| | | | X | Secure Virtual Machine was not enabled (EFER.SVME=0). |
| | X | X | | The instruction is only recognized in protected mode. |
| General protection, #GP | | | X | CPL was not zero. |
| | | | X | rAX referenced a physical address above the maximum supported physical address. |
| | | | X | The address in rAX was not aligned on a 4-Kbyte boundary. |
VMSAVE Save State to VMCB 

Stores a subset of the processor state into the VMCB specified by the system-physical address in the 
rAX register (the portion of RAX used to form the address is determined by the effective address size). 

The VMSAVE and VMLOAD instructions complement the state save/restore abilities of VMRUN and 
#VMEXIT, providing access to hidden state that software is otherwise unable to access, plus some 
additional commonly-used state. 

This is a Secure Virtual Machine (SVM) instruction. Support for the SVM architecture and the SVM 
instructions is indicated by CPUID Fn8000_0001_ECX[SVM] = 1. For more information on using the 
CPUID instruction, see the reference page for the CPUID instruction on page 160. 

This instruction generates a #UD exception if SVM is not enabled. See “Enabling SVM” in AMD64 
Architecture Programmer's Manual Volume 2: System Programming, order# 24593. 

Instruction Encoding 

Mnemonic      Opcode      Description 

VMSAVE rAX    0F 01 DB    Save additional guest state to VMCB. 

Action 

IF ((MSR_EFER.SVME == 0) || (!PROTECTED_MODE)) 
    EXCEPTION [#UD]    // This instruction can only be executed in protected 
                       // mode with SVM enabled 

IF (CPL != 0)          // This instruction is only allowed at CPL 0 
    EXCEPTION [#GP] 

IF (rAX contains an unsupported system-physical address) 
    EXCEPTION [#GP] 

Store to a VMCB at system-physical address rAX: 
    FS, GS, TR, LDTR (including all hidden state) 
    KernelGsBase 
    STAR, LSTAR, CSTAR, SFMASK 
    SYSENTER_CS, SYSENTER_ESP, SYSENTER_EIP 

Related Instructions 

VMLOAD 

rFLAGS Affected 

None. 
Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The SVM instructions are not supported as indicated by CPUID Fn8000_0001_ECX[SVM] = 0. |
| | | | X | Secure Virtual Machine was not enabled (EFER.SVME=0). |
| | X | X | | The instruction is only recognized in protected mode. |
| General protection, #GP | | | X | CPL was not zero. |
| | | | X | rAX referenced a physical address above the maximum supported physical address. |
| | | | X | The address in rAX was not aligned on a 4-Kbyte boundary. |
WBINVD Writeback and Invalidate Caches 

WBNOINVD Writeback With No Invalidate 

WBINVD writes all modified lines in all levels of cache associated with this processor to main 
memory and invalidates the caches. This may or may not include lower level caches associated with 
another processor that shares any level of this processor's cache hierarchy. WBNOINVD does not 
invalidate the caches, instead leaving all (or most) cache lines in the cache hierarchy in non-modified 
state, but in all other respects it behaves the same as WBINVD. 

CPUID Fn8000_001D_EDX[WBINVD]_xN indicates the behavior of the operation at various levels 
of the cache hierarchy, for both WBINVD and WBNOINVD, with respect to lower branches in the 
cache hierarchy. If the feature bit is 0, the instruction causes the write back and (for WBINVD) 
invalidation of all lower level caches of other processors sharing the designated level of cache. If the 
feature bit is 1, the instruction does not necessarily cause the write back and invalidation of all lower 
level caches of other processors sharing the designated level of cache. See Appendix E, “Obtaining 
Processor Information Via the CPUID Instruction,” on page 607 for more information on using the 
CPUID function. 

The INVD instruction can be used when cache coherence with memory is not important. 

These instructions do not invalidate TLB caches. 

These are privileged instructions. The current privilege level of a procedure invalidating the 
processor’s internal caches must be zero. 

WBINVD and WBNOINVD are serializing instructions. 

Support for WBNOINVD is indicated by CPUID Fn8000_0008_EBX[WBNOINVD] = 1. However, 
the encoding of WBNOINVD results in it being interpreted as WBINVD on processors that do not 
explicitly support WBNOINVD, including legacy processors. For more information on using the 
CPUID instruction, see the description of the CPUID instruction on page 160. 
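The fallback behavior described above can be sketched with a small Python decoder. It is only an illustration of the paragraph, with a hypothetical function name: the byte sequences are the two encodings listed for this instruction pair (0F 09 and F3 0F 09), and processor support is passed in as a flag rather than read from CPUID.

```python
def decode_wb(insn: bytes, supports_wbnoinvd: bool) -> str:
    """Return the operation performed for a WBINVD/WBNOINVD encoding.

    On a processor without WBNOINVD support, the F3-prefixed form is
    interpreted as WBINVD, so the caches are also invalidated.
    """
    if insn == b"\x0f\x09":
        return "WBINVD"
    if insn == b"\xf3\x0f\x09":
        return "WBNOINVD" if supports_wbnoinvd else "WBINVD"
    raise ValueError("not a WBINVD/WBNOINVD encoding")
```

Software that relies on cache contents being retained should therefore check the WBNOINVD feature bit rather than assume the F3-prefixed form avoids invalidation.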


Mnemonic    Opcode      Description 

WBINVD      0F 09       Write modified cache lines to main memory, invalidate internal caches, and trigger external cache flushes. 

WBNOINVD    F3 0F 09    Write modified cache lines to main memory and trigger external cache flushes. 

Related Instructions 

CLFLUSH, CLWB, INVD 

rFLAGS Affected 

None 

Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| General protection, #GP | | X | X | CPL was not 0. |
WRMSR Write to Model-Specific Register 

Writes data to 64-bit model-specific registers (MSRs). These registers are widely used in 
performance-monitoring and debugging applications, as well as testability and program execution 
tracing. 

This instruction writes the contents of the EDX:EAX register pair into a 64-bit model-specific register 
specified in the ECX register. The 32 bits in the EDX register are mapped into the high-order bits of 
the model-specific register and the 32 bits in EAX form the low-order 32 bits. 
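The register-pair mapping can be written out as a short Python sketch. The function names are hypothetical, and the code models only the bit layout described above, not the privilege or validity checks performed by the instruction.

```python
def msr_value_from_edx_eax(edx: int, eax: int) -> int:
    """Combine EDX (high-order 32 bits) and EAX (low-order 32 bits)."""
    return ((edx & 0xFFFF_FFFF) << 32) | (eax & 0xFFFF_FFFF)


def edx_eax_from_msr_value(value: int) -> tuple[int, int]:
    """Split a 64-bit value into the (EDX, EAX) pair WRMSR expects."""
    return (value >> 32) & 0xFFFF_FFFF, value & 0xFFFF_FFFF
```

Software that holds a 64-bit value in a single register splits it this way before executing WRMSR, with the MSR address placed in ECX.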

This instruction must be executed at a privilege level of 0 or a general protection fault #GP(0) will be 
raised. This exception is also generated if an attempt is made to specify a reserved or unimplemented 
model-specific register in ECX. 

WRMSR is a serializing instruction. 

Support for the WRMSR instruction is indicated by CPUID Fn0000_0001_EDX[MSR] = 1 OR 
CPUID Fn8000_0001_EDX[MSR] = 1. For more information on using the CPUID instruction, see the 
description of the CPUID instruction on page 160. 

The CPUID instruction can provide model information useful in determining the existence of a 
particular MSR. 

See “Model-Specific Registers (MSRs)” in Volume 2: System Programming, for more information 
about model-specific registers, machine check architecture, performance monitoring and debug 
registers. 

Mnemonic    Opcode    Description 

WRMSR       0F 30     Write EDX:EAX to the MSR specified by ECX. 

Related Instructions 

RDMSR 

rFLAGS Affected 

None 

Exceptions 

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The WRMSR instruction is not supported, as indicated by CPUID Fn0000_0001_EDX[MSR] = 0 OR CPUID Fn8000_0001_EDX[MSR] = 0. |
| General protection, #GP | | X | X | CPL was not 0. |
| | X | | X | The value in ECX specifies a reserved or unimplemented MSR address. |
| | X | | X | Writing 1 to any bit that must be zero (MBZ) in the MSR. |
| | X | | X | Writing a non-canonical value to a MSR that can only be written with canonical values. |
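The last #GP cause refers to canonical values. As an illustration, the following Python sketch tests whether a 64-bit value is in canonical form. The function name is hypothetical, and the default of 48 implemented virtual-address bits is an assumption; canonical form is defined in Volumes 1 and 2, and the implemented width depends on the processor.

```python
def is_canonical(value: int, va_bits: int = 48) -> bool:
    """Return True if bits 63 down to va_bits-1 of value are all equal.

    va_bits is the number of implemented virtual-address bits
    (assumed to be 48 unless the caller says otherwise).
    """
    value &= (1 << 64) - 1
    upper = value >> (va_bits - 1)               # bits 63 .. va_bits-1
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones
```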
Appendix A Opcode and Operand Encodings 


This appendix specifies the opcode and operand encodings for each instruction in the AMD64 
instruction set. As discussed in Chapter 1, “Instruction Encoding,” the basic operation and implied 
operand type(s) of an instruction are encoded by the binary value of the opcode byte. The 
correspondence between an opcode binary value and its meaning is provided by the opcode map. 

Each opcode map has 256 entries and can encode up to 256 different operations. Since the AMD64 
instruction set comprises more than 256 instructions, multiple opcode maps are utilized to encode the 
instruction set. A particular opcode map is selected using the instruction encoding syntax diagrammed 
in Figure 1-1 on page 2. For each opcode map, values may be reserved or utilized for purposes other 
than encoding an instruction operation. 

To preserve compatibility with future instruction architectural extensions, reserved opcodes should not 
be used. If a means to reliably cause an invalid-opcode exception (#UD) is required, software should 
use one of the UDx opcodes. These opcodes are set aside for this purpose and will not be used for 
future instructions. The UD opcodes are located on the secondary opcode map at code points B9h, 
0Bh, and FFh. 

The following section provides a key to the notation used in the opcode maps to specify the implied 
operand types. 

Opcode-Syntax Notation 

In the opcode maps which follow, each table entry represents a specific form of an instruction, 
identifying the instruction by its mnemonic and listing the operand or operands peculiar to that 
opcode. If a register-based operand is specified by the opcode itself, the operand is represented directly 
using the register mnemonic as defined in “Summary of Registers and Data Types” on page 38. If the 
operand is encoded in one or more bytes following the opcode byte, the following special notation is 
used to represent the operand and its encoding in more generic terms. 

This special notation, used exclusively in the opcode maps, is composed of three parts: 

• an initial capital letter that represents the operand source / destination (register-based, memory- 
based, or immediate) and how it is encoded in the instruction (either as an immediate, or via the 
ModRM.reg, ModRM.{mod,r/m}, or VEX/XOP.vvvv fields). For register-based operands, the 
initial letter also specifies the register type (General-purpose, MMX, YMM/XMM, debug, or 
control register). 

• one, two, or three letter modifier (in lowercase) that represents the data type (for example, byte, 
word, quadword, packed single-precision floating-point vector). 

• x, which indicates for an SSE instruction that the instruction supports both vector sizes (128 bits 
and 256 bits). The specific vector size is encoded in the VEX/XOP.L field. L=0 indicates 128 bits 
and L=1 indicates 256 bits. 
The following list describes the meaning of each letter that is used in the first position of the operand 
notation: 

A    A far pointer encoded in the instruction. No ModRM byte in the instruction encoding. 
B    General-purpose register specified by the VEX or XOP vvvv field. 
C    Control register specified by the ModRM.reg field. 
D    Debug register specified by the ModRM.reg field. 
E    General purpose register or memory operand specified by the r/m field of the ModRM byte. For memory operands, the ModRM byte may be followed by a SIB byte to specify one of the indexed register-indirect addressing forms. 
F    rFLAGS register. 
G    General purpose register specified by the ModRM.reg field. 
H    YMM or XMM register specified by the VEX/XOP.vvvv field. 
I    Immediate value encoded in the instruction immediate field. 
J    The instruction encoding includes a relative offset that is added to the rIP. 
L    YMM or XMM register specified using the most-significant 4 bits of an 8-bit immediate value. In legacy or compatibility mode the most significant bit is ignored. 
M    A memory operand specified by the {mod, r/m} field of the ModRM byte. ModRM.mod ≠ 11b. 
M*   A sparse array of memory operands addressed using the VSIB addressing mode. See “VSIB Addressing” in Volume 4. 
N    64-bit MMX register specified by the ModRM.r/m field. The ModRM.mod field must be 11b. 
O    The offset of an operand is encoded in the instruction. There is no ModRM byte in the instruction encoding. Indexed register-indirect addressing using the SIB byte is not supported. 
P    64-bit MMX register specified by the ModRM.reg field. 
Q    64-bit MMX-register or memory operand specified by the {mod, r/m} field of the ModRM byte. For memory operands, the ModRM byte may be followed by a SIB byte to specify one of the indexed register-indirect addressing forms. 
R    General purpose register specified by the ModRM.r/m field. The ModRM.mod field must be 11b. 
S    Segment register specified by the ModRM.reg field. 
U    YMM/XMM register specified by the ModRM.r/m field. The ModRM.mod field must be 11b. 
V    YMM/XMM register specified by the ModRM.reg field. 
W    YMM/XMM register or memory operand specified by the {mod, r/m} field of the ModRM byte. For memory operands, the ModRM byte may be followed by a SIB byte to specify one of the indexed register-indirect addressing forms. 
X    A memory operand addressed by the DS:rSI registers. Used in string instructions. 
Y    A memory operand addressed by the ES:rDI registers. Used in string instructions. 

The following list provides the key for the second part of the operand notation: 

a    Two 16-bit or 32-bit memory operands, depending on the effective operand size. Used in the BOUND instruction. 
b    A byte, irrespective of the effective operand size. 
c    A byte or a word, depending on the effective operand size. 
d    A doubleword (32 bits), irrespective of the effective operand size. 
do   A double octword (256 bits), irrespective of the effective operand size. 
i    A 16-bit integer. 
j    A 32-bit integer. 
m    A bit mask of size equal to the source operand. 
mn   Where n = 2, 4, 8, or 16. A bit mask of size n. 
o    An octword (128 bits), irrespective of the effective operand size. 
o.q  Operand is either the upper or lower half of a 128-bit value. 
p    A 32- or 48-bit far pointer, depending on 16- or 32-bit effective operand size. 
pb   Vector with byte-wide (8-bit) elements (packed byte). 
pd   A double-precision (64-bit) floating-point vector operand (packed double-precision). 
pdw  Vector composed of 32-bit doublewords. 
ph   A half-precision (16-bit) floating-point vector operand (packed half-precision). 
pi   Vector composed of 16-bit integers (packed integer). 
pj   Vector composed of 32-bit integers (packed double integer). 
pk   Vector composed of 8-bit integers (packed half-word integer). 
pq   Vector composed of 64-bit integers (packed quadword integer). 
pqw  Vector composed of 64-bit quadwords (packed quadword). 
ps   A single-precision floating-point vector operand (packed single-precision). 
pw   Vector composed of 16-bit words (packed word). 
q    A quadword (64 bits), irrespective of the effective operand size. 
s    A 6-byte or 10-byte pseudo-descriptor. 
sd   A scalar double-precision floating-point operand (scalar double). 
sj   A scalar doubleword (32-bit) integer operand (scalar double integer). 
ss   A scalar single-precision floating-point operand (scalar single). 
v    A word, doubleword, or quadword (in 64-bit mode), depending on the effective operand size. 
w    A word, irrespective of the effective operand size. 
x    Instruction supports both vector sizes (128 bits or 256 bits). Size is encoded using the VEX/XOP.L field (L=0: 128 bits; L=1: 256 bits). This symbol may be appended to ps or pd to represent a packed single- or double-precision floating-point vector of either size; or to pk, pi, pj, or pq, to represent a packed 8-bit, 16-bit, 32-bit, or 64-bit packed integer vector of either size. 
y    A doubleword or quadword depending on effective operand size. 
z    A word if the effective operand size is 16 bits, or a doubleword if the effective operand size is 32 or 64 bits. 
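The three-part notation described above can be split mechanically. The following Python sketch is an illustration with a hypothetical function name; it separates the leading capital letter, the lowercase modifier, and a trailing x that marks support for both vector sizes. It assumes the token really is in this notation: explicit register names such as AL or rAX, which also appear in the opcode maps, must be filtered out by the caller.

```python
def parse_operand_notation(token: str) -> tuple[str, str, bool]:
    """Split an opcode-map operand token into (letter, modifier, both_sizes).

    'Ev'   -> ('E', 'v', False)
    'Wpsx' -> ('W', 'ps', True)
    A lone 'x' modifier (for example 'Vx') is kept as the modifier itself.
    """
    if token.startswith("M*"):
        letter, rest = "M*", token[2:]
    else:
        letter, rest = token[0], token[1:]
    both_sizes = len(rest) > 1 and rest.endswith("x")
    modifier = rest[:-1] if both_sizes else rest
    return letter, modifier, both_sizes
```

The returned letter and modifier can then be looked up in the two lists above to obtain the operand source and data type.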

For some instructions, fields in the ModRM or SIB byte are used as encoding extensions. This is 
indicated using the following notation: 

/n A ModRM-byte reg field or SIB-byte base field, where n is a value between zero (000b) and 7 
(111b). 
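For the ModRM case, the /n value is the 3-bit reg field, which occupies ModRM bits [5:3]. The following Python sketch (hypothetical function name) extracts the three ModRM fields; the positions of the mod field in bits [7:6] and the r/m field in bits [2:0] follow the ModRM layout described in Chapter 1, “Instruction Encoding.”

```python
def modrm_fields(modrm: int) -> tuple[int, int, int]:
    """Return (mod, reg, rm) for a ModRM byte.

    reg is the value written as /n in opcode listings.
    """
    mod = (modrm >> 6) & 0x3
    reg = (modrm >> 3) & 0x7
    rm = modrm & 0x7
    return mod, reg, rm
```

For example, the ModRM byte E0h decodes to mod = 11b, reg = 4, and r/m = 0, so the byte sequence 0F 00 E0 selects the /4 entry for opcode 0F 00.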

For SSE instructions that take scalar operands, VEX/XOP.L field is ignored. 

For immediates and memory-based operands, only the size and not the datatype is indicated. Operand 
widths and datatypes are specified based on the source operands. For instructions where the result 
overwrites one of the source registers, the data width and datatype of the result may not match that of 
the source register. See individual instruction descriptions for more details. 

A.1 Opcode Maps 

In all of the following opcode maps, cells shaded grey represent reserved opcodes. 

A.1.1 Legacy Opcode Maps 

Primary Opcode Map. Tables A-1 and A-2 below show the primary opcode map (known in legacy 
terminology as one-byte opcodes). 

Table A-1 below shows those instructions for which the low nibble is in the range 0-7h. Table A-2 on 
page 458 shows those instructions for which the low nibble is in the range 8-Fh. In both tables, the 
rows show the full range (0-Fh) of the high nibble, and the columns show the specified range of the 
low nibble. 
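The row and column lookup described above amounts to splitting the opcode byte into its two nibbles. A minimal Python sketch, using a hypothetical function name, is:

```python
def primary_map_position(opcode: int) -> tuple[str, int, int]:
    """Return (table, row, column) for a one-byte opcode.

    row    -- high nibble of the opcode (0-Fh)
    column -- low nibble of the opcode
    table  -- 'A-1' for low nibbles 0-7h, 'A-2' for low nibbles 8-Fh
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must be a single byte")
    high, low = opcode >> 4, opcode & 0xF
    table = "A-1" if low <= 0x7 else "A-2"
    return table, high, low
```

For example, opcode 63h is found in Table A-1 at row 6, column 3, and opcode 0Fh is found in Table A-2 at row 0, column F.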
Table A-1. Primary Opcode Map (One-byte Opcodes), Low Nibble 0-7h 

| Nibble (1) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| 0 | ADD Eb, Gb | ADD Ev, Gv | ADD Gb, Eb | ADD Gv, Ev | ADD AL, Ib | ADD rAX, Iz | PUSH ES (3) | POP ES (3) |
| 1 | ADC Eb, Gb | ADC Ev, Gv | ADC Gb, Eb | ADC Gv, Ev | ADC AL, Ib | ADC rAX, Iz | PUSH SS (3) | POP SS (3) |
| 2 | AND Eb, Gb | AND Ev, Gv | AND Gb, Eb | AND Gv, Ev | AND AL, Ib | AND rAX, Iz | seg ES (6) | DAA (3) |
| 3 | XOR Eb, Gb | XOR Ev, Gv | XOR Gb, Eb | XOR Gv, Ev | XOR AL, Ib | XOR rAX, Iz | seg SS (6) | AAA (3) |
| 4 | INC eAX / REX prefix (5) | INC eCX / REX prefix (5) | INC eDX / REX prefix (5) | INC eBX / REX prefix (5) | INC eSP / REX prefix (5) | INC eBP / REX prefix (5) | INC eSI / REX prefix (5) | INC eDI / REX prefix (5) |
| 5 | PUSH rAX/r8 | PUSH rCX/r9 | PUSH rDX/r10 | PUSH rBX/r11 | PUSH rSP/r12 | PUSH rBP/r13 | PUSH rSI/r14 | PUSH rDI/r15 |
| 6 | PUSHA (3), PUSHD (3) | POPA (3), POPD (3) | BOUND (3) Gv, Ma | ARPL (3) Ew, Gw; MOVSXD (4) Gv, Ez | seg FS prefix | seg GS prefix | operand size override prefix | address size override prefix |
| 7 | JO Jb | JNO Jb | JB Jb | JNB Jb | JZ Jb | JNZ Jb | JBE Jb | JNBE Jb |
| 8 | Group 1 (2) Eb, Ib | Group 1 (2) Ev, Iz | Group 1 (2) Eb, Ib (3) | Group 1 (2) Ev, Ib | TEST Eb, Gb | TEST Ev, Gv | XCHG Eb, Gb | XCHG Ev, Gv |
| 9 | XCHG r8, rAX; NOP, PAUSE | XCHG rCX/r9, rAX | XCHG rDX/r10, rAX | XCHG rBX/r11, rAX | XCHG rSP/r12, rAX | XCHG rBP/r13, rAX | XCHG rSI/r14, rAX | XCHG rDI/r15, rAX |
| A | MOV AL, Ob | MOV rAX, Ov | MOV Ob, AL | MOV Ov, rAX | MOVSB Yb, Xb | MOVSW/D/Q Yv, Xv | CMPSB Xb, Yb | CMPSW/D/Q Xv, Yv |
| B | MOV AL, Ib; r8b, Ib | MOV CL, Ib; r9b, Ib | MOV DL, Ib; r10b, Ib | MOV BL, Ib; r11b, Ib | MOV AH, Ib; r12b, Ib | MOV CH, Ib; r13b, Ib | MOV DH, Ib; r14b, Ib | MOV BH, Ib; r15b, Ib |
| C | Group 2 (2) Eb, Ib | Group 2 (2) Ev, Ib | RET near Iw | RET near | LES (3) Gz, Mp; VEX escape prefix | LDS (3) Gz, Mp; VEX escape prefix | Group 11 (2) Eb, Ib | Group 11 (2) Ev, Iz |
| D | Group 2 (2) Eb, 1 | Group 2 (2) Ev, 1 | Group 2 (2) Eb, CL | Group 2 (2) Ev, CL | AAM Ib (3) | AAD Ib (3) | invalid | XLAT, XLATB |
| E | LOOPNE/NZ Jb | LOOPE/Z Jb | LOOP Jb | JrCXZ Jb | IN AL, Ib | IN eAX, Ib | OUT Ib, AL | OUT Ib, eAX |
| F | LOCK Prefix | INT1 | REPNE Prefix | REP / REPE Prefix | HLT | CMC | Group 3 (2) Eb | Group 3 (2) Ev |

Notes (referenced in the table by parenthesized numbers): 

1. Rows in this table show the high opcode nibble, columns show the low opcode nibble (both in hexadecimal). 

2. An opcode extension is specified using the reg field of the ModRM byte (ModRM bits [5:3]) which follows the opcode. 
See Table A-6 on page 465 for details. 

3. Invalid in 64-bit mode. 

4. Valid only in 64-bit mode. 

5. Used as REX prefixes in 64-bit mode. 

6. This is a null prefix in 64-bit mode. 

Table A-2. Primary Opcode Map (One-byte Opcodes), Low Nibble 8-Fh 

Nibble¹ | 8 | 9 | A | B | C | D | E | F 
0 | OR Eb, Gb | OR Ev, Gv | OR Gb, Eb | OR Gv, Ev | OR AL, Ib | OR rAX, Iz | PUSH CS³ | escape to secondary opcode map 
1 | SBB Eb, Gb | SBB Ev, Gv | SBB Gb, Eb | SBB Gv, Ev | SBB AL, Ib | SBB rAX, Iz | PUSH DS³ | POP DS³ 
2 | SUB Eb, Gb | SUB Ev, Gv | SUB Gb, Eb | SUB Gv, Ev | SUB AL, Ib | SUB rAX, Iz | seg CS⁶ | DAS³ 
3 | CMP Eb, Gb | CMP Ev, Gv | CMP Gb, Eb | CMP Gv, Ev | CMP AL, Ib | CMP rAX, Iz | seg DS⁶ | AAS³ 
4 | DEC³ eAX / REX prefix⁵ | DEC³ eCX / REX prefix⁵ | DEC³ eDX / REX prefix⁵ | DEC³ eBX / REX prefix⁵ | DEC³ eSP / REX prefix⁵ | DEC³ eBP / REX prefix⁵ | DEC³ eSI / REX prefix⁵ | DEC³ eDI / REX prefix⁵ 
5 | POP rAX/r8 | POP rCX/r9 | POP rDX/r10 | POP rBX/r11 | POP rSP/r12 | POP rBP/r13 | POP rSI/r14 | POP rDI/r15 
6 | PUSH Iz | IMUL Gv, Ev, Iz | PUSH Ib | IMUL Gv, Ev, Ib | INSB Yb, DX | INSW/D Yz, DX | OUTS/OUTSB DX, Xb | OUTS/OUTSW/D DX, Xz 
7 | JS Jb | JNS Jb | JP Jb | JNP Jb | JL Jb | JNL Jb | JLE Jb | JNLE Jb 
8 | MOV Eb, Gb | MOV Ev, Gv | MOV Gb, Eb | MOV Gv, Ev | MOV Mw/Rv, Sw | LEA Gv, M | MOV Sw, Ew | Group 1a² / XOP escape prefix 
9 | CBW, CWDE, CDQE | CWD, CDQ, CQO | CALL³ Ap | WAIT / FWAIT | PUSHF/D/Q Fv | POPF/D/Q Fv | SAHF | LAHF 
A | TEST AL, Ib | TEST rAX, Iz | STOSB Yb, AL | STOSW/D/Q Yv, rAX | LODSB AL, Xb | LODSW/D/Q rAX, Xv | SCASB AL, Yb | SCASW/D/Q rAX, Yv 
B | MOV rAX, Iv / r8, Iv | MOV rCX, Iv / r9, Iv | MOV rDX, Iv / r10, Iv | MOV rBX, Iv / r11, Iv | MOV rSP, Iv / r12, Iv | MOV rBP, Iv / r13, Iv | MOV rSI, Iv / r14, Iv | MOV rDI, Iv / r15, Iv 
C | ENTER Iw, Ib | LEAVE | RET far Iw | RET far | INT3 | INT Ib | INTO³ | IRET, IRETD, IRETQ 
D | x87 instructions: see Table A-15 on page 477 (all columns) 
E | CALL Jz | JMP Jz | JMP Ap³ | JMP Jb | IN AL, DX | IN eAX, DX | OUT DX, AL | OUT DX, eAX 
F | CLC | STC | CLI | STI | CLD | STD | Group 4² Eb | Group 5² 

Notes: 

1. Rows in this table show the high opcode nibble, columns show the low opcode nibble (both in hexadecimal). 

2. An opcode extension is specified using the reg field of the ModRM byte (ModRM bits [5:3]) which follows the opcode. 
See Table A-6 on page 465 for details. 

3. Invalid in 64-bit mode. 

4. Valid only in 64-bit mode. 

5. Used as REX prefixes in 64-bit mode. 

6. This is a null prefix in 64-bit mode. 


Secondary Opcode Map. As described in “Encoding Syntax” on page 1, the escape code 0Fh 
indicates the switch from the primary to the secondary opcode map. In legacy terminology, the 
secondary opcode map is presented as a listing of “two-byte” opcodes where the first byte is 0Fh. 
Tables A-3 and A-4 show the secondary opcode map. 

Table A-3 below shows those instructions for which the low nibble is in the range 0-7h. Table A-4 on 
page 462 shows those instructions for which the low nibble is in the range 8-Fh. In both tables, the 
rows show the full range (0-Fh) of the high nibble, and the columns show the specified range of the 
low nibble. Note the added column labeled “prefix.” 

For the secondary opcode map shown below, the legacy prefixes 66h, F2h, and F3h are repurposed to 
provide additional opcode encoding space. For those rows that utilize them, the presence of a 66h, 
F2h, or F3h prefix changes the operation or the operand types specified by the corresponding opcode 
value. 
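
As a hedged illustration of this prefix repurposing, the sketch below selects the mnemonic for secondary opcode 10h (row 1, column 0 of Table A-3) from the prefix that precedes the 0Fh escape byte. The function name and the dictionary are assumptions made for this example; the sketch accepts at most one prefix byte and ignores REX and all other prefixes.

```python
# Mnemonics for opcode 0Fh 10h, keyed by the repurposed legacy prefix (None = no prefix).
ROW_0F10 = {None: "MOVUPS", 0xF3: "MOVSS", 0x66: "MOVUPD", 0xF2: "MOVSD"}
REPURPOSED_PREFIXES = {0x66, 0xF2, 0xF3}


def mnemonic_for_0f10(code):
    """Return the mnemonic for an encoding of the form [prefix] 0Fh 10h (simplified)."""
    prefix = None
    index = 0
    if code and code[0] in REPURPOSED_PREFIXES:
        prefix = code[0]
        index = 1
    if bytes(code[index:index + 2]) != b"\x0f\x10":
        raise ValueError("not an 0Fh 10h encoding")
    return ROW_0F10[prefix]


print(mnemonic_for_0f10(bytes([0x0F, 0x10])))        # no prefix
print(mnemonic_for_0f10(bytes([0xF3, 0x0F, 0x10])))  # F3h prefix
```

The same opcode value therefore names four different instructions, one per prefix row of the table.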

As discussed in “Encoding Extensions Using the ModRM Byte” on page 465, some opcode values 
represent a group of instructions. This is denoted in the map entry by “Group n”, where n = [1:17,P]. 
Instructions within a group are encoded by the reg field of the ModRM byte. These encodings are 
specified in Table A-7 on page 467. For some opcodes, both the reg and the r/m field of the ModRM 
byte are used to extend the encoding. See Table A-8 on page 468. 

Table A-3. Secondary Opcode Map (Two-byte Opcodes), Low Nibble 0-7h 


Prefix 

Nibble 1 

0 

1 

2 

3 

4 

5 

6 

7 

n/a 

0 

Group 6 2 

Group 7 2 

LAR 

Gv, Ew 

LSL 

Gv, Ew 


SYSCALL 

CLTS 

SYSRET 

none 

1 

MOVUPS 

Vps, Wps 

Wps, Vps 

MOVLPS 
Vq, Mq 

MOVHLPS 
Vo.q, Uo.q 

MOVLPS 
Mq, Vq 

UNPCKLPS 
Vps,Wps 

UNPCKHPS 
Vps,Wps 

MOVHPS 
Vo.q, Mq 

MOVLHPS 
Vo.q, Uo.q 

MOVHPS 

Mq, Vo.q 

F3 

MOVSS 

Vss, Wss | Wss, Vss 

MOVSLDUP 
Vps, Wps 




MOVSHDUP 
Vps, Wps 


66 

MOVUPD 

Vpd, Wpd Wpd, Vpd 

MOVLPD 

Vo.q, Mq 

Mq, Vo.q 

UNPCKLPD 
Vo.q, Wo.q 

UNPCKHPD 
Vo.q, Wo.q 

MOVHPD 

Vo.q, Mq 

Mq, Vo.q 

F2 

MOVSD 

Vsd, Wsd Wsd, Vsd 

MOVDDUP 
Vo, Wsd 






n/a 

2 

MOV 4 

Rd/q, Cd/q 

Rd/q, Dd/q 

Cd/q, Rd/q 

Dd/q, Rd/q 





n/a 

3 

WRMSR 

RDTSC 

RDMSR 

RDPMC 

SYSENTER 3 

SYSEXIT 3 



n/a 

4 

CMOVO 
Gv, Ev 

CMOVNO 
Gv, Ev 

CMOVB 

Gv, Ev 

CMOVNB 
Gv, Ev 

CMOVZ 

Gv, Ev 

CMOVNZ 
Gv, Ev 

CMOVBE 
Gv, Ev 

CMOVNBE 

Gv, Ev 

none 

5 

MOVMSKPS 
Gd, Ups 

SQRTPS 
Vps, Wps 

RSQRTPS 
Vps, Wps 

RCPPS 
Vps, Wps 

ANDPS 
Vps, Wps 

ANDNPS 
Vps, Wps 

ORPS 
Vps, Wps 

XORPS 

Vps, Wps 

F3 


SQRTSS 
Vss, Wss 

RSQRTSS 
Vss, Wss 

RCPSS 
Vss, Wss 





66 

MOVMSKPD 
Gd, Upd 

SQRTPD 
Vpd, Wpd 



ANDPD 
Vpd, Wpd 

ANDNPD 
Vpd, Wpd 

ORPD 
Vpd, Wpd 

XORPD 

Vpd, Wpd 

F2 


SQRTSD 
Vsd, Wsd 







none 

6 

PUNPCKLBW 

Pq, Qd 

PUNPCKLWD 

Pq, Qd 

PUNPCKLDQ 

Pq, Qd 

PACKSSWB 
Ppi, Qpi 

PCMPGTB 
Ppk, Qpk 

PCMPGTW 
Ppi, Qpi 

PCMPGTD 

Ppj, Qpj 

PACKUSWB 
Ppi, Qpi 

F3 









66 

PUNPCKLBW 

Vo.q, Wo.q 

PUNPCKLWD 

Vo.q, Wo.q 

PUNPCKLDQ 

Vo.q, Wo.q 

PACKSSWB 
Vpi, Wpi 

PCMPGTB 
Vpk, Wpk 

PCMPGTW 
Vpi, Wpi 

PCMPGTD 
Vpj, Wpj 

PACKUSWB 
Vpi, Wpi 

F2 









none 

7 

PSHUFW 
Pq, Qq, lb 

Group 12 2 

Group 13 2 

Group 14 2 

PCMPEQB 
Ppk, Qpk 

PCMPEQW 
Ppi, Qpi 

PCMPEQD 

Ppj, Qpj 

EMMS 

F3 

PSHUFHW 
Vq, Wq, lb 





66 

PSHUFD 
Vo, Wo, lb 

PCMPEQB 
Vpk, Wpk 

PCMPEQW 
Vpi, Wpi 

PCMPEQD 
Vpj, Wpj 


F2 

PSHUFLW 

Vq, Wq, lb 






Notes: 


1. Rows show the high opcode nibble, columns show the low opcode nibble (both in hexadecimal). All opcodes in this 
map are immediately preceded in the instruction encoding by the escape byte 0Fh. 

2. An opcode extension is specified using the reg field of the ModRM byte (ModRM bits [5:3]) which follows the opcode. 
See Table A-7 on page 467 for details. 

3. Invalid in long mode. 

4. Operand size is based on processor mode. 


24594 — Rev. 3.28—September 2019 



Table A-3. Secondary Opcode Map (Two-byte Opcodes), Low Nibble 0-7h (continued) 


Prefix 

Nibble 1 

0 

1 

2 

3 

4 

5 

6 

7 

n/a 

8 

JO Jz 

JNO Jz 

JB Jz 

JNB Jz 

JZ Jz 

JNZ Jz 

JBE Jz 

JNBE Jz 

n/a 

9 

SETO Eb 

SETNO Eb 

SETB Eb 

SETNB Eb 

SETZ Eb 

SETNZ Eb 

SETBE Eb 

SETNBE Eb 

n/a 

A 

PUSH FS 

POP FS 

CPUID 

BT Ev, Gv 

SHLD 

Ev, Gv, Ib 

Ev, Gv, CL 



n/a 

B 

CMPXCHG 

Eb, Gb Ev, Gv 

LSS Gz, Mp 

BTR Ev, Gv 

LFS Gz, Mp 

LGS Gz, Mp 

MOVZX 

Gv, Eb 

Gv, Ew 

none 

C 

XADD 

CMPPS 
Vps, Wps, lb 

MOVNTI 
My, Gy 

PINSRW 
Pq, Ew, lb 

PEXTRW 
Gd, Nq, lb 

SHUFPS 
Vps, Wps, lb 

Group 9 2 

Mq 

F3 

Eb, Gb 

Ev, Gv 

CMPSS 
Vss, Wss, lb 





66 

CMPPD 

Vpd, Wpd, lb 


PINSRW 

Vo, Ew, lb 

PEXTRW 

Gd, Uo, lb 

SHUFPD 

Vpd, Wpd, lb 

F2 

CMPSD 

Vsd, Wsd, lb 





none 

D 


PSRLW 

Pq, Qq 

PSRLD 

Pq, Qq 

PSRLQ 

Pq, Qq 

PADDQ 

Pq, Qq 

PMULLW 
Pq, Qq 


PMOVMSKB 
Gd, Nq 

F3 







MOVQ2DQ 
Vo, Nq 


66 

ADDSUBPD 
Vpd, Wpd 

PSRLW 

Vo, Wo 

PSRLD 

Vo, Wo 

PSRLQ 

Vo, Wo 

PADDQ 

Vo, Wo 

PMULLW 
Vo, Wo 

MOVQ 
Wq, Vq 

PMOVMSKB 
Gd, Uo 

F2 

ADDSUBPS 
Vps, Wps 






MOVDQ2Q 
Pq, Uq 


none 

E 

PAVGB 

Pq, Qq 

PSRAW 
Pq, Qq 

PSRAD 

Pq, Qq 

PAVGW 

Pq, Qq 

PMULHUW 
Pq, Qq 

PMULHW 
Pq, Qq 


MOVNTQ 

Mq, Pq 

F3 







CVTDQ2PD 
Vpd, Wpj 


66 

PAVGB 

Vo, Wo 

PSRAW 

Vo, Wo 

PSRAD 

Vo, Wo 

PAVGW 

Vo, Wo 

PMULHUW 
Vo, Wo 

PMULHW 
Vo, Wo 

CVTTPD2DQ 
Vpj, Wpd 

MOVNTDQ 
Mo, Vo 

F2 







CVTPD2DQ 
Vpj, Wpd 


none 

F 


PSLLW 

Pq, Qq 

PSLLD 

Pq, Qq 

PSLLQ 

Pq, Qq 

PMULUDQ 
Pq, Qq 

PMADDWD 
Pq, Qq 

PSADBW 
Pq, Qq 

MASKMOVQ 
Pq, Nq 

F3 









66 


PSLLW 
Vpw, Wo.q 

PSLLD 
Vpwd, Wo.q 

PSLLQ 
Vpqw, Wo.q 

PMULUDQ 
Vpj, Wpj 

PMADDWD 
Vpi, Wpi 

PSADBW 
Vpk, Wpk 

MASKMOVDQU 
Vpb, Upb 

F2 

LDDQU 

Vo, Mo 









Notes: 


1. Rows show the high opcode nibble, columns show the low opcode nibble (both in hexadecimal). All opcodes in this 
map are immediately preceded in the instruction encoding by the escape byte 0Fh. 

2. An opcode extension is specified using the reg field of the ModRM byte (ModRM bits [5:3]) which follows the opcode. 
See Table A-7 on page 467 for details. 

3. Invalid in long mode. 

4. Operand size is based on processor mode. 

Table A-4. Secondary Opcode Map (Two-byte Opcodes), Low Nibble 8-Fh 


Prefix 

Nibble 1 

8 

9 

A 

B 

C 

D 

E 

F 

n/a 

0 

INVD 

WBINVD 

(F3) 

WBNOINVD 


UD2 


Group P 2 

PREFETCH 

FEMMS 

3DNow! 

See 

“3DNow!™ 
Opcodes” 
on page 473 

n/a 

1 

Group 16 2 

NOP 3 

NOP 3 

NOP 3 

NOP 3 

NOP 3 

NOP 3 

NOP 3 

none 

2 

MOVAPS 

Vps, Wps 

Wps, Vps 

CVTPI2PS 

Vps, Qpj 

MOVNTPS 

Mo, Vps 

CVTTPS2PI 

Ppj, Wps 

CVTPS2PI 

Ppj, Wps 

UCOMISS 

Vss, Wss 

COMISS 

Vss, Wss 

F3 



CVTSI2SS 

Vss, Ey 

MOVNTSS 

Md, Vss 

CVTTSS2SI 

Gy, Wss 

CVTSS2SI 

Gy, Wss 



66 

MOVAPD 

Vpd, Wpd 

Wpd, Vpd 

CVTPI2PD 

Vpd, Qpj 

MOVNTPD 

Mo, Vpd 

CVTTPD2PI 

Ppj, Wpd 

CVTPD2PI 

Ppj, Wpd 

UCOMISD 

Vsd, Wsd 

COMISD 

Vsd, Wsd 

F2 



CVTSI2SD 

Vsd, Ey 

MOVNTSD 

Mq, Vsd 

CVTTSD2SI 

Gy, Wsd 

CVTSD2SI 

Gy, Wsd 



n/a 

3 

Escape to 
0F_38h 
opcode map 


Escape to 
0F_3Ah 
opcode map 






n/a 

4 

CMOVS 

Gv, Ev 

CMOVNS 

Gv, Ev 

CMOVP 

Gv, Ev 

CMOVNP 

Gv, Ev 

CMOVL 

Gv, Ev 

CMOVNL 

Gv, Ev 

CMOVLE 

Gv, Ev 

CMOVNLE 

Gv, Ev 

none 

5 

ADDPS 

Vps, Wps 

MULPS 

Vps, Wps 

CVTPS2PD 

Vpd, Wps 

CVTDQ2PS 

Vps, Wo 

SUBPS 

Vps, Wps 

MINPS 

Vps, Wps 

DIVPS 

Vps, Wps 

MAXPS 

Vps, Wps 

F3 

ADDSS 

Vss, Wss 

MULSS 

Vss, Wss 

CVTSS2SD 

Vsd, Wss 

CVTTPS2DQ 

Vo, Wps 

SUBSS 

Vss, Wss 

MINSS 

Vss, Wss 

DIVSS 

Vss, Wss 

MAXSS 

Vss, Wss 

66 

ADDPD 

Vpd, Wpd 

MULPD 

Vpd, Wpd 

CVTPD2PS 

Vps, Wpd 

CVTPS2DQ 

Vo, Wps 

SUBPD 

Vpd, Wpd 

MINPD 

Vpd, Wpd 

DIVPD 

Vpd, Wpd 

MAXPD 

Vpd, Wpd 

F2 

ADDSD 

Vsd, Wsd 

MULSD 

Vsd, Wsd 

CVTSD2SS 

Vss, Wsd 


SUBSD 

Vsd, Wsd 

MINSD 

Vsd, Wsd 

DIVSD 

Vsd, Wsd 

MAXSD 

Vsd, Wsd 

none 

6 

PUNPCKHBW 

Pq, Qd 

PUNPCKHWD 

Pq, Qd 

PUNPCKHDQ 

Pq, Qd 

PACKSSDW 

Pq, Qq 



MOVD 

Py, Ey 

MOVQ 

Pq, Qq 

F3 








MOVDQU 

Vo, Wo 

66 

PUNPCKHBW 

Vo, Wq 

PUNPCKHWD 

Vo, Wq 

PUNPCKHDQ 

Vo, Wq 

PACKSSDW 

Vo, Wo 

PUNPCKLQDQ 

Vo, Wq 

PUNPCKHQDQ 

Vo, Wq 

MOVD 

Vy,Ey 

MOVDQA 

Vo, Wo 

F2 










Notes: 


1. Rows show the high opcode nibble, columns show the low opcode nibble (both in hexadecimal). All opcodes in this 
map are immediately preceded in the instruction encoding by the escape byte 0Fh. 

2. An opcode extension is specified using the reg field of the ModRM byte (ModRM bits [5:3]) which follows the opcode. 
See Table A-7 on page 467 for details. 

3. This instruction takes a ModRM byte. 

Table A-4. Secondary Opcode Map (Two-byte Opcodes), Low Nibble 8-Fh (continued) 


Prefix 

Nibble 1 

8 

9 

A 

B 

C 

D 

E 

F 

none 








MOVD 

Ey, Py 

MOVQ 

Qq, Pq 

F3 








MOVQ 

MOVDQU 








Vq, Wq 

Wo, Vo 

66 

7 

Group 17 2 

EXTRQ 



HADDPD 

HSUBPD 

MOVD 

MOVDQA 



Vo.q, Uo 



Vpd, Wpd 

Vpd, Wpd 

Ey, Vy 

Wo, Vo 



INSERTQ 

INSERTQ 



HADDPS 

HSUBPS 



F2 


Vo.q, Uo.q, 
lb, lb 

Vo.q, Uo 



Vps, Wps 

Vps, Wps 



n/a 

8 

JS 

JNS 

JP 

JNP 

JL 

JNL 

JLE 

JNLE 

Jz 

Jz 

Jz 

Jz 

Jz 

Jz 

Jz 

Jz 

n/a 

9 

SETS 

SETNS 

SETP 

SETNP 

SETL 

SETNL 

SETLE 

SETNLE 


Eb 

Eb 

Eb 

Eb 

Eb 

Eb 

Eb 

Eb 

n/a 

A 

PUSH 

POP 

RSM 

BTS 

SHRD 

Group 15 2 

IMUL 

M 

GS 

GS 


Ev, Gv 

Ev, Gv, lb 

Ev, Gv, CL 


Gv, Ev 

none 



Group 10 2 

Group 8 2 

BTC 

BSF 

BSR 

MOVSX 




Ev, lb 

Ev, Gv 

Gv, Ev 

Gv, Ev 

Gv, Eb 

Gv, Ew 

F3 

B 

POPCNT 




TZCNT 

LZCNT 



Gv, Ev 




Gv, Ev 

Gv, Ev 



F2 










n/a 

C 

BSWAP 

rAX/r8 

rCX/r9 

rDX/r10 

rBX/r11 

rSP/r12 

rBP/r13 

rSI/r14 

rDI/r15 

none 


PSUBUSB 

PSUBUSW 

PMINUB 

PAND 

PADDUSB 

PADDUSW 

PMAXUB 

PANDN 


Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

F3 

n 









66 

u 

PSUBUSB 

PSUBUSW 

PMINUB 

PAND 

PADDUSB 

PADDUSW 

PMAXUB 

PANDN 


Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

F2 










none 


PSUBSB 

PSUBSW 

PMINSW 

POR 

PADDSB 

PADDSW 

PMAXSW 

PXOR 


Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

Pq, Qq 

F3 

F 









66 


PSUBSB 

PSUBSW 

PMINSW 

POR 

PADDSB 

PADDSW 

PMAXSW 

PXOR 


Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

F2 











Notes: 


1. Rows show the high opcode nibble, columns show the low opcode nibble (both in hexadecimal). All opcodes in this 
map are immediately preceded in the instruction encoding by the escape byte 0Fh. 

2. An opcode extension is specified using the reg field of the ModRM byte (ModRM bits [5:3]) which follows the opcode. 
See Table A-7 on page 467 for details. 

3. This instruction takes a ModRM byte. 

Table A-4. Secondary Opcode Map (Two-byte Opcodes), Low Nibble 8-Fh (continued) 


Prefix 

Nibble 1 

8 

9 

A 

B 

c 

D 

E 

F 

none 

F 

PSUBB 

Pq, Qq 

PSUBW 

Pq, Qq 

PSUBD 

Pq, Qq 

PSUBQ 

Pq, Qq 

PADDB 

Pq, Qq 

PADDW 

Pq, Qq 

PADDD 

Pq, Qq 

UD0 

F3 








66 

PSUBB 

Vo, Wo 

PSUBW 

Vo, Wo 

PSUBD 

Vo, Wo 

PSUBQ 

Vo, Wo 

PADDB 

Vo, Wo 

PADDW 

Vo, Wo 

PADDD 

Vo, Wo 

F2 









Notes: 

1. Rows show the high opcode nibble, columns show the low opcode nibble (both in hexadecimal). All opcodes in this 
map are immediately preceded in the instruction encoding by the escape byte 0Fh. 

2. An opcode extension is specified using the reg field of the ModRM byte (ModRM bits [5:3]) which follows the opcode. 
See Table A-7 on page 467 for details. 

3. This instruction takes a ModRM byte. 


rFLAGS Condition Codes for CMOVcc, Jcc, and SETcc Instructions. Table A-5 shows 
the rFLAGS condition codes specified by the low nibble in the opcode of the CMOVcc, Jcc, and 
SETcc instructions. 


Table A-5. rFLAGS Condition Codes for CMOVcc, Jcc, and SETcc 

Low Nibble of Opcode (hex) | rFLAGS Value | cc Mnemonic | Arithmetic Type | Condition(s) 
0 | OF = 1 | O | Signed | Overflow 
1 | OF = 0 | NO | Signed | No Overflow 
2 | CF = 1 | B, C, NAE | Unsigned | Below, Carry, Not Above or Equal 
3 | CF = 0 | NB, NC, AE | Unsigned | Not Below, No Carry, Above or Equal 
4 | ZF = 1 | Z, E | Unsigned | Zero, Equal 
5 | ZF = 0 | NZ, NE | Unsigned | Not Zero, Not Equal 
6 | CF = 1 or ZF = 1 | BE, NA | Unsigned | Below or Equal, Not Above 
7 | CF = 0 and ZF = 0 | NBE, A | Unsigned | Not Below or Equal, Above 
8 | SF = 1 | S | Signed | Sign 
9 | SF = 0 | NS | Signed | Not Sign 
A | PF = 1 | P, PE | n/a | Parity, Parity Even 
B | PF = 0 | NP, PO | n/a | Not Parity, Parity Odd 
C | (SF xor OF) = 1 | L, NGE | Signed | Less than, Not Greater than or Equal to 
D | (SF xor OF) = 0 | NL, GE | Signed | Not Less than, Greater than or Equal to 
E | (SF xor OF) = 1 or ZF = 1 | LE, NG | Signed | Less than or Equal to, Not Greater than 
F | (SF xor OF) = 0 and ZF = 0 | NLE, G | Signed | Not Less than or Equal to, Greater than 
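
The condition codes can be evaluated mechanically, because each odd code is the logical negation of the even code before it. The following Python sketch illustrates this under stated assumptions: the function name and keyword parameters are hypothetical, each flag is passed as 0 or 1, and the unsigned conditions (B, NB, BE, NBE) test the carry flag CF, as in the architectural definitions of Jcc.

```python
def condition_met(cc, cf=0, zf=0, sf=0, of=0, pf=0):
    """Evaluate condition code cc (0-Fh) against the given rFLAGS bits."""
    if not 0 <= cc <= 0xF:
        raise ValueError("cc is the low nibble of the opcode")
    base_tests = (
        of == 1,                     # 0: O
        cf == 1,                     # 2: B, C, NAE
        zf == 1,                     # 4: Z, E
        cf == 1 or zf == 1,          # 6: BE, NA
        sf == 1,                     # 8: S
        pf == 1,                     # A: P, PE
        (sf ^ of) == 1,              # C: L, NGE
        (sf ^ of) == 1 or zf == 1,   # E: LE, NG
    )
    result = base_tests[cc >> 1]
    # Each odd code is the logical negation of the even code before it.
    return (not result) if cc & 1 else result


print(condition_met(0x4, zf=1))        # Z / E with ZF = 1
print(condition_met(0xC, sf=1, of=0))  # L / NGE with SF != OF
```

The same function serves CMOVcc, Jcc, and SETcc, since all three take the condition from the low nibble of the opcode.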

Encoding Extensions Using the ModRM Byte. The ModRM byte, which immediately 
follows the opcode byte, is used in certain instruction encodings to provide additional opcode bits with 
which to define the function of the instruction. ModRM bytes have three fields (mod, reg, and r/m), as 
shown in Figure A-1. 


Bits:   7   6   5   4   3   2   1   0 
      |  mod  |    reg    |    r/m    | 

Figure A-1. ModRM-Byte Fields 

In most cases, the reg field (bits [5:3]), and in some cases, the r/m field (bits [2:0]) provide the 
additional bits used to extend the encodings of the opcode byte. In the case of the x87 floating-point 
instructions, the entire ModRM byte is used to extend the opcode encodings. 

Table A-6 shows how the ModRM.reg field is used to extend the range of opcodes in the primary 
opcode map. The opcode ranges are organized into groups of opcode extensions. The group number is 
shown in the left-most column. These groups are referenced in the primary opcode map shown in 
Table A-1 on page 457 and Table A-2 on page 458. An entry of “n.a.” in the Prefix column means that 
prefixes are not applicable to the opcodes in that row. Prefixes only apply to certain 64-bit media and 
SSE instructions. 

Table A-7 on page 467 shows how the ModRM.reg field is used to extend the range of the opcodes in 
the secondary opcode map. 

The /0 through /7 notation for the ModRM reg field (bits [5:3]) in the tables below means that the 
three-bit field contains a value from zero (000b) to seven (111b). 
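
A minimal sketch of the field extraction follows, assuming the ModRM byte is supplied as an integer. The helper name split_modrm is an assumption made for this example.

```python
def split_modrm(modrm):
    """Split a ModRM byte into its (mod, reg, r/m) fields."""
    if not 0 <= modrm <= 0xFF:
        raise ValueError("ModRM is a single byte")
    mod = (modrm >> 6) & 0b11    # bits [7:6]
    reg = (modrm >> 3) & 0b111   # bits [5:3], the /0 through /7 value
    rm = modrm & 0b111           # bits [2:0]
    return mod, reg, rm


mod, reg, rm = split_modrm(0xD8)
print(f"mod={mod:02b}b reg=/{reg} r/m={rm}")
```

For a group opcode, the reg value returned here is the /n column to consult in the ModRM extension tables.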


Table A-6. ModRM.reg Extensions for the Primary Opcode Map¹ 

Group Number | Prefix | Opcode | /0 | /1 | /2 | /3 | /4 | /5 | /6 | /7 
Group 1 | n/a | 80 | ADD Eb, Ib | OR Eb, Ib | ADC Eb, Ib | SBB Eb, Ib | AND Eb, Ib | SUB Eb, Ib | XOR Eb, Ib | CMP Eb, Ib 
Group 1 | n/a | 81 | ADD Ev, Iz | OR Ev, Iz | ADC Ev, Iz | SBB Ev, Iz | AND Ev, Iz | SUB Ev, Iz | XOR Ev, Iz | CMP Ev, Iz 
Group 1 | n/a | 82 | ADD Eb, Ib² | OR Eb, Ib² | ADC Eb, Ib² | SBB Eb, Ib² | AND Eb, Ib² | SUB Eb, Ib² | XOR Eb, Ib² | CMP Eb, Ib² 
Group 1 | n/a | 83 | ADD Ev, Ib | OR Ev, Ib | ADC Ev, Ib | SBB Ev, Ib | AND Ev, Ib | SUB Ev, Ib | XOR Ev, Ib | CMP Ev, Ib 
Group 1a | n/a | 8F | POP Ev | | | | | | | 
Group 2 | n/a | C0 | ROL Eb, Ib | ROR Eb, Ib | RCL Eb, Ib | RCR Eb, Ib | SHL/SAL Eb, Ib | SHR Eb, Ib | SHL/SAL⁵ Eb, Ib | SAR Eb, Ib 
Group 2 | n/a | C1 | ROL Ev, Ib | ROR Ev, Ib | RCL Ev, Ib | RCR Ev, Ib | SHL/SAL Ev, Ib | SHR Ev, Ib | SHL/SAL⁵ Ev, Ib | SAR Ev, Ib 
Group 2 | n/a | D0 | ROL Eb, 1 | ROR Eb, 1 | RCL Eb, 1 | RCR Eb, 1 | SHL/SAL Eb, 1 | SHR Eb, 1 | SHL/SAL⁵ Eb, 1 | SAR Eb, 1 
Group 2 | n/a | D1 | ROL Ev, 1 | ROR Ev, 1 | RCL Ev, 1 | RCR Ev, 1 | SHL/SAL Ev, 1 | SHR Ev, 1 | SHL/SAL⁵ Ev, 1 | SAR Ev, 1 
Group 2 | n/a | D2 | ROL Eb, CL | ROR Eb, CL | RCL Eb, CL | RCR Eb, CL | SHL/SAL Eb, CL | SHR Eb, CL | SHL/SAL⁵ Eb, CL | SAR Eb, CL 
Group 2 | n/a | D3 | ROL Ev, CL | ROR Ev, CL | RCL Ev, CL | RCR Ev, CL | SHL/SAL Ev, CL | SHR Ev, CL | SHL/SAL⁵ Ev, CL | SAR Ev, CL 
Group 3 | n/a | F6 | TEST Eb, Ib | | NOT Eb | NEG Eb | MUL Eb | IMUL Eb | DIV Eb | IDIV Eb 
Group 3 | n/a | F7 | TEST Ev, Iz | | NOT Ev | NEG Ev | MUL Ev | IMUL Ev | DIV Ev | IDIV Ev 
Group 4 | n/a | FE | INC Eb | DEC Eb | | | | | | 
Group 5 | n/a | FF | INC Ev | DEC Ev | CALL Ev | CALL Mp | JMP Ev | JMP Mp | PUSH Ev | 

Notes: 

1. See Table A-7 on page 467 for ModRM extensions for the secondary (two-byte) opcode map. 

2. Invalid in 64-bit mode. 

3. This instruction takes a ModRM byte. 

4. Reserved prefetch encodings are aliased to the /0 encoding (PREFETCH Exclusive) for future compatibility. 

5. Redundant encoding generally unsupported by tools. 

Table A-7. ModRM.reg Extensions for the Secondary Opcode Map 


Group 

Number 

Prefix 

Opcode 

ModRM reg Field 

/0 

/1 

/2 

/3 

/4 

/5 

/6 

/7 

Group 6 

n/a 

00 

SLDT 

Mw/Rv 

STR 

Mw/Rv 

LLDT Ew 

LTR Ew 

VERR Ew 

VERW Ew 



Group 7 

n/a 

01 

SGDT 

Ms 

SIDT 

Ms 

MONITOR 1 

MWAIT 

LGDT Ms 

XGETBV 1 

XSETBV 

LIDT Ms 

SVM 1 

SMSWMw 
/ Rv 


LMSW Ew 

INVLPG 

Mb 

SWAPGS 1 

RDTSCP 

Group 8 

n/a 

BA 





BT Ev, lb 

BTS Ev, lb 

BTR Ev, lb 

BTC Ev, lb 

Group 9 

n/a 

C7 


CMPX- 
CHG8B Mq 

CMPX- 

CHG16B 

Mo 





RDRAND 

Rv 


Group 10 

n/a 

B9 

UD1 

Group 11 

n/a 

C6 

MOV 

Eb,lb 








n/a 

C7 

MOV 

Ev,lz 








Group 12 

none 

71 



PSRLW 

Nq, lb 


PSRAW 

Nq, lb 


PSLLW 

Nq, lb 


66 



PSRLW 

Uo, lb 


PSRAW 

Uo, lb 


PSLLW 

Uo, lb 


F2, F3 









Group 13 

none 

72 



PSRLD 

Nq, lb 


PSRAD 

Nq, lb 


PSLLD 

Nq, lb 


66 



PSRLD 

Uo, lb 


PSRAD 

Uo, lb 


PSLLD 

Uo, lb 


F2, F3 










Notes: 


1. Opcode is extended further using the r/m field of the ModRM byte in conjunction with the reg field. See Table A-8 
on page 468 for ModRM.r/m extensions of this opcode. 

2. Invalid in 64-bit mode. 

3. This instruction takes a ModRM byte. 

4. Reserved prefetch encodings are aliased to the /0 encoding (PREFETCH Exclusive) for future compatibility. 

5. ModRM.mod = 11b. 

6. ModRM.mod ≠ 11b. 

7. ModRM.mod ≠ 11b; ModRM.mod = 11b is an invalid encoding. 

Table A-7. ModRM.reg Extensions for the Secondary Opcode Map (continued) 


Group 

Number 

Prefix 

Opcode 

ModRM reg Field 

/0 

/1 

/2 

/3 

/4 

/5 

/6 

/7 

Group 14 

none 

73 



PSRLQ 

Nq, lb 




PSLLQ 

Nq, lb 


66 



PSRLQ 

Uo, lb 

PSRLDQ 

Uo, lb 



PSLLQ 

Uo, lb 

PSLLDQ 

Uo, lb 

F2, F3 









Group 15 

none 

AE 

FXSAVE 

M 

FXRSTOR 

M 

LDMXCSR 

Md 

STMXCSR 

Md 

XSAVE M 6 

LFENCE 5 

XRSTOR 

M 6 

MFENCE 5 

XSAVE- 
OPT M 6 

SFENCE 5 

CLFLUSH 

Mb 6 

F3 

RDFSBASE 

Rv 

RDGSBASE 

Rv 

WRFSBASE 

Rv 

WRGSBASE 

Rv 





F2 









66 







CLWB Mb 6 


Group 16 

n/a. 

18 

PREFETCHNTA 

PREFETCHT0 

PREFETCHT1 

PREFETCHT2 

NOP 4 

NOP 4 

NOP 4 

NOP 4 

Group 17 

66 

78 

EXTRQ 

Vo.q, lb, lb 








none, 
F2, F3 









Group P 

n/a. 

0D 

PREFETCH 

Exclusive 

PREFETCH 

Modified 

PREFETCH 

PREFETCH 

Modified 

PREFETCH 

PREFETCH 

PREFETCH 

PREFETCH 







Notes: 


1. Opcode is extended further using the r/m field of the ModRM byte in conjunction with the reg field. See Table A-8 
on page 468 for ModRM.r/m extensions of this opcode. 

2. Invalid in 64-bit mode. 

3. This instruction takes a ModRM byte. 

4. Reserved prefetch encodings are aliased to the /0 encoding (PREFETCH Exclusive) for future compatibility. 

5. ModRM.mod = 11b. 

6. ModRM.mod ≠ 11b. 

7. ModRM.mod ≠ 11b; ModRM.mod = 11b is an invalid encoding. 


Secondary Opcode Map, ModRM Extensions for Opcode 01h. Table A-8 below shows 
the ModRM byte encodings for the 01h opcode. In the table, the full ModRM byte is listed below the 
instruction in hexadecimal. For all instructions shown, the ModRM byte is immediately preceded by 
the byte string {0Fh, 01h} in the instruction encoding. 

Table A-8. Opcode 01h ModRM Extensions 

reg Field | ModRM.r/m Field: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 
/1 | MONITOR (C8) | MWAIT (C9) | | | | | | 
/2 | XGETBV (D0) | XSETBV (D1) | | | | | | 
/3 | VMRUN (D8) | VMMCALL (D9) | VMLOAD (DA) | VMSAVE (DB) | STGI (DC) | CLGI (DD) | SKINIT (DE) | INVLPGA (DF) 
/7 | SWAPGS (F8) | RDTSCP (F9) | MONITORX (FA) | MWAITX (FB) | | RDPRU (FD) | | 

ModRM.mod = 11b 
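
The combined reg and r/m extension for opcode 01h can be sketched as a lookup keyed by both fields. The dictionary below is transcribed from Table A-8; the function name decode_0f01 is an assumption made for this example, and the sketch covers only the register forms (ModRM.mod = 11b).

```python
# (reg, r/m) -> mnemonic for opcode 0Fh 01h with ModRM.mod = 11b, per Table A-8.
OPCODE_0F01 = {
    (1, 0): "MONITOR", (1, 1): "MWAIT",
    (2, 0): "XGETBV", (2, 1): "XSETBV",
    (3, 0): "VMRUN", (3, 1): "VMMCALL", (3, 2): "VMLOAD", (3, 3): "VMSAVE",
    (3, 4): "STGI", (3, 5): "CLGI", (3, 6): "SKINIT", (3, 7): "INVLPGA",
    (7, 0): "SWAPGS", (7, 1): "RDTSCP", (7, 2): "MONITORX", (7, 3): "MWAITX",
    (7, 5): "RDPRU",
}


def decode_0f01(modrm):
    """Return the mnemonic for a register-form 0Fh 01h ModRM byte, or None."""
    if (modrm >> 6) != 0b11:
        return None  # memory forms are outside the scope of this sketch
    return OPCODE_0F01.get(((modrm >> 3) & 0b111, modrm & 0b111))


print(decode_0f01(0xC8))  # reg = /1, r/m = 0
print(decode_0f01(0xFD))  # reg = /7, r/m = 5
```

Each key reproduces the hexadecimal byte listed in the table: with mod = 11b, the ModRM byte equals C0h + 8 × reg + r/m.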


0F_38h and 0F_3Ah Opcode Maps. The 0F_38h and 0F_3Ah opcode maps are used primarily 
to encode the legacy SSE instructions. In legacy terminology, these maps are presented as three-byte 
opcodes where the first two bytes are {0Fh, 38h} and {0Fh, 3Ah} respectively. 

In these maps the legacy prefixes F2h and F3h are repurposed to provide additional opcode encoding 
space. In rows [0:E] the legacy prefix 66h is also used to modify the opcode. However, in row F, 66h is 
used as an operand-size override. See the CRC32 instruction as an example. 

The 0F_38h opcode map is presented below in Tables A-9 and A-10. The 0F_3Ah opcode map is 
presented in Tables A-11 and A-12. 

Table A-9. 0F_38h Opcode Map, Low Nibble = [0h:7h] 


Prefix 

Opcode 

x0h 

x1h 

x2h 

x3h 

x4h 

x5h 

x6h 

x7h 

none 

0xh 

PSHUFB 

Ppb, Qpb 

PHADDW 

Ppi, Qpi 

PHADDD 

Ppj, Qpj 

PHADDSW 

Ppi, Qpi 

PMADDUBSW 

Ppk, Qpk 

PHSUBW 

Ppi, Qpi 

PHSUBD 

Ppj, Qpj 

PHSUBSW 

Ppi, Qpi 

66h 

PSHUFB 

Vpb, Wpb 

PHADDW 

Vpi, Wpi 

PHADDD 

Vpj, Wpj 

PHADDSW 

Vpi, Wpi 

PMADDUBSW 

Vpk, Wpk 

PHSUBW 

Vpi, Wpi 

PHSUBD 

Vpj, Wpj 

PHSUBSW 

Vpi, Wpi 

none 

1xh 









66h 

PBLENDVB 

Vpb, Wpb 




BLENDVPS 

Vps, Wps 

BLENDVPD 

Vpd, Wpd 


PTEST 

Vo, Wo 

none 

2xh 









66h 

PMOVSXBW 

Vpi, Wpk 

PMOVSXBD 

Vpj, Wpk 

PMOVSXBQ 

Vpq, Wpk 

PMOVSXWD 

Vpj, Wpi 

PMOVSXWQ 
Vpq, Wpi 

PMOVSXDQ 

Vpq, Wpj 



none 

3xh 









66h 

PMOVZXBW 

Vpi, Wpk 

PMOVZXBD 

Vpj, Wpk 

PMOVZXBQ 

Vpq, Wpk 

PMOVZXWD 

Vpj, Wpi 

PMOVZXWQ 
Vpq, Wpi 

PMOVZXDQ 

Vpq, Wpj 


PCMPGTQ 

Vpq, Wpq 

none 

4xh 









66h 

PMULLD 

Vpj, Wpj 

PHMINPOSUW 

Vpi, Wpi 








5xh-Exh 









none 

Fxh 

MOVBE 

Gv, Mv 

MOVBE 

Mv, Gv 







F2h 

CRC32 

Gy, Eb 

CRC32 

Gy, Ev 







66h 

and 

F2h 

CRC32 

Gy, Eb 

CRC32 

Gy, Ev 

Table A-10. 0F_38h Opcode Map, Low Nibble = [8h:Fh] 


Prefix 

Opcode 

x8h 

x9h 

xAh 

xBh 

xCh 

xDh 

xEh 

xFh 



PSIGNB 

PSIGNW 

PSIGND 

PMULHRSW 





none 

0xh 

Ppk, Qpk 

Ppi, Qpi 

Ppj, Qpj 

Ppi, Qpi 






PSIGNB 

PSIGNW 

PSIGND 

PMULHRSW 





66h 


Vpk, Wpk 

Vpi, Wpi 

Vpj, Wpj 

Vpi, Wpi 











PABSB 

PABSW 

PABSD 


none 

1xh 





Ppk, Qpk 

Ppi, Qpi 

Ppj, Qpj 







PABSB 

PABSW 

PABSD 


66h 






Vpk, Wpk 

Vpi, Wpi 

Vpj, Wpj 


none 

2xh 










PMULDQ 

PCMPEQQ 

MOVNTDQA 

PACKUSDW 





66h 


Vpq, Wpj 

Vpq, Wpq 

Vo, Mo 

Vpi, Wpj 





none 

3xh 










PMINSB 

PMINSD 

PMINUW 

PMINUD 

PMAXSB 

PMAXSD 

PMAXUW 

PMAXUD 

66h 


Vpk, Wpk 

Vpj, Wpj 

Vpi, Wpi 

Vpj, Wpj 

Vpk, Wpk 

Vpj, Wpj 

Vpi, Wpi 

Vpj, Wpj 


4xh-Cxh 














AESIMC 

AESENC 

AESENCLAST 

AESDEC 

AESDECLAST 

66h 

Dxh 




Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 

Vo, Wo 


Exh-Fxh 

Table A-11. 0F_3Ah Opcode Map, Low Nibble = [0h:7h] 


Prefix 

Opcode 

x0h 

x1h 

x2h 

x3h 

x4h 

x5h 

x6h 

x7h 

n/a 

0xh 









none 

1xh 









66h 





PEXTRB 

Mb, Vpk, lb 

PEXTRB 

Ry, Vpk, lb 

PEXTRW 

Mw, Vpw, lb 

PEXTRW 

Ry, Vpw, lb 

PEXTRD 

Ed, Vpj, lb 
PEXTRQ 1 

Eq, Vpq, lb 

EXTRACTPS 

Md, Vps, lb 

EXTRACTPS 

Ry, Vps, lb 

none 

2xh 









66h 

PINSRB 

Vpk, Mb, lb 

PINSRB 

Vpk, Rb, lb 

INSERTPS 

Vps, Md, lb 

INSERTPS 

Vps, Uo, lb 

PINSRD 

Vpj, Ed, lb 
PINSRQ 1 

Vpq, Eq, lb 







3xh 









none 

4xh 









66h 

DPPS 

Vps, Wps, lb 

DPPD 

Vpd, Wpd, lb 

MPSADBW 

Vpk, Wpk, lb 


PCLMULQDQ 
Vpq, Wpq, lb 




n/a 

5xh 









none 

6xh 









66h 

PCMPESTRM 

Vo, Wo, lb 

PCMPESTRI 

Vo, Wo, lb 

PCMPISTRM 

Vo, Wo, lb 

PCMPISTRI 

Vo, Wo, lb 






7xh-Exh 









n/a 

Fxh 










Note 1: When REX prefix is present 


Table A-12. 0F_3Ah Opcode Map, Low Nibble = [8h:Fh] 


Prefix 

Opcode 

x8h 

x9h 

xAh 

xBh 

xCh 

xDh 

xEh 

xFh 

none 

0xh 








PALIGNR 

Ppb, Qpb, lb 

66h 

ROUNDPS 

Vps, Wps, lb 

ROUNDPD 

Vpd, Wpd, lb 

ROUNDSS 

Vss, Wss, lb 

ROUNDSD 

Vsd, Wsd, lb 

BLENDPS 

Vps, Wps, lb 

BLENDPD 

Vpd, Wpd, lb 

PBLENDW 

Vpw, Wpw, lb 

PALIGNR 

Vpb, Wpb, lb 


1xh-Cxh 









66h 

Dxh 








AESKEYGENASSIST 

Vo, Wo, lb 


Fxh 

A.1.2 3DNow!™ Opcodes 

The 64-bit media instructions include the MMX™ instructions and the AMD 3DNow!™ instructions. 
The MMX instructions are encoded using two opcode bytes, as described in “Secondary Opcode Map” 
on page 458. 

The 3DNow! instructions are encoded using two 0Fh opcode bytes and an immediate byte that is 
located at the last byte position of the instruction encoding. Thus, the format for 3DNow! instructions 
is: 


0Fh 0Fh [ModRM] [SIB] [displacement] imm8_opcode 

Table A-13 and Table A-14 on page 475 show the immediate byte following the opcode bytes for 
3DNow! instructions. In these tables, rows show the high nibble of the immediate byte, and columns 
show the low nibble of the immediate byte. Table A-13 shows the immediate bytes whose low nibble 
is in the range 0-7h. Table A-14 shows the same for immediate bytes whose low nibble is in the range 
8-Fh. 

Byte values shown as reserved in these tables have implementation-specific functions, which can 
include an invalid-opcode exception. 
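
Because the opcode is the last byte of the encoding, a decoder must first consume the ModRM, SIB, and displacement bytes before it can identify the instruction. The sketch below sidesteps that step by assuming the caller passes the complete encoding of exactly one instruction with no prefixes. The immediate-byte values are taken from Tables A-13 and A-14; the function name mnemonic_3dnow is an assumption made for this example.

```python
# Immediate-byte opcodes taken from Tables A-13 and A-14.
OPCODES_3DNOW = {
    0x0C: "PI2FW", 0x0D: "PI2FD", 0x1C: "PF2IW", 0x1D: "PF2ID",
    0x8A: "PFNACC", 0x8E: "PFPNACC",
    0x90: "PFCMPGE", 0x94: "PFMIN", 0x96: "PFRCP", 0x97: "PFRSQRT",
    0x9A: "PFSUB", 0x9E: "PFADD",
    0xA0: "PFCMPGT", 0xA4: "PFMAX", 0xA6: "PFRCPIT1", 0xA7: "PFRSQIT1",
    0xAA: "PFSUBR", 0xAE: "PFACC",
    0xB0: "PFCMPEQ", 0xB4: "PFMUL", 0xB6: "PFRCPIT2", 0xB7: "PMULHRW",
    0xBB: "PSWAPD", 0xBF: "PAVGUSB",
}


def mnemonic_3dnow(encoding):
    """Name a 3DNow! instruction from its complete byte encoding, or return None."""
    if len(encoding) < 4 or bytes(encoding[:2]) != b"\x0f\x0f":
        raise ValueError("expected 0Fh 0Fh, a ModRM byte, and an immediate byte")
    return OPCODES_3DNOW.get(encoding[-1])  # the opcode is the final byte


print(mnemonic_3dnow(bytes([0x0F, 0x0F, 0xC1, 0x9E])))
```

A return value of None corresponds to an immediate byte that the tables show as reserved.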

Table A-13. Immediate Byte for 3DNow!™ Opcodes, Low Nibble 0-7h 

Nibble¹ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 
0-8 | (empty) 
9 | PFCMPGE Pq, Qq | | | | PFMIN Pq, Qq | | PFRCP Pq, Qq | PFRSQRT Pq, Qq 
A | PFCMPGT Pq, Qq | | | | PFMAX Pq, Qq | | PFRCPIT1 Pq, Qq | PFRSQIT1 Pq, Qq 
B | PFCMPEQ Pq, Qq | | | | PFMUL Pq, Qq | | PFRCPIT2 Pq, Qq | PMULHRW Pq, Qq 
C-F | (empty) 

Notes: 

1. All 3DNow!™ opcodes consist of two 0Fh bytes. This table shows the immediate byte for 3DNow! opcodes. Rows 
show the high nibble of the immediate byte. Columns show the low nibble of the immediate byte. 

Table A-14. Immediate Byte for 3DNow!™ Opcodes, Low Nibble 8-Fh 

Nibble¹ | 8 | 9 | A | B | C | D | E | F 
0 | | | | | PI2FW Pq, Qq | PI2FD Pq, Qq | | 
1 | | | | | PF2IW Pq, Qq | PF2ID Pq, Qq | | 
2-7 | (empty) 
8 | | | PFNACC Pq, Qq | | | | PFPNACC Pq, Qq | 
9 | | | PFSUB Pq, Qq | | | | PFADD Pq, Qq | 
A | | | PFSUBR Pq, Qq | | | | PFACC Pq, Qq | 
B | | | | PSWAPD Pq, Qq | | | | PAVGUSB Pq, Qq 
C-F | (empty) 

Notes: 

1. All 3DNow!™ opcodes consist of two 0Fh bytes. This table shows the immediate byte for 3DNow! opcodes. Rows 
show the high nibble of the immediate byte. Columns show the low nibble of the immediate byte. 
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A.1.3 x87 Encodings 

All x87 instructions begin with an opcode byte in the range D8h to DFh, as shown in Table A-2 on 
page 458. These opcodes are followed by a ModRM byte that further defines the opcode. Table A-15 
shows both the opcode byte and the ModRM byte for each x87 instruction. 

There are two significant ranges for the ModRM byte for x87 opcodes: 00-BFh and C0-FFh. When
the value of the ModRM byte falls within the first range, 00-BFh, the opcode uses only the reg field to
further define the opcode. When the value of the ModRM byte falls within the second range, C0-FFh,
the opcode uses the entire ModRM byte to further define the opcode.
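
The two ranges can be sketched in a few lines of Python. This is only an illustration of the rule stated above; the function name `classify_x87_modrm` and the keys of the returned dictionary are assumptions made for the example.

```python
def classify_x87_modrm(modrm: int) -> dict:
    """Split an x87 ModRM byte and report which part selects the instruction."""
    if not 0 <= modrm <= 0xFF:
        raise ValueError("ModRM must be a single byte")
    fields = {
        "mod": modrm >> 6,            # bits 7:6
        "reg": (modrm >> 3) & 0b111,  # bits 5:3
        "rm": modrm & 0b111,          # bits 2:0
    }
    if modrm <= 0xBF:
        # 00-BFh: memory form, only the reg field extends the opcode.
        fields["form"] = "memory"
        fields["selector"] = fields["reg"]
    else:
        # C0-FFh: register form, the whole ModRM byte extends the opcode.
        fields["form"] = "register"
        fields["selector"] = modrm
    return fields


if __name__ == "__main__":
    print(classify_x87_modrm(0x38))  # memory form, reg field = 7
    print(classify_x87_modrm(0xE3))  # register form, selector = E3h
```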

Byte values shown as reserved or invalid in Table A-15 have implementation-specific functions, 
which can include an invalid-opcode exception. 

The basic instructions FNSTENV, FNSTCW, FNCLEX, FNINIT, FNSAVE, and FNSTSW
do not check for possible floating-point exceptions before operating. Utility versions of these
mnemonics are provided that insert an FWAIT (opcode 9Bh) before the corresponding non-waiting
instruction. These are FSTENV, FSTCW, FCLEX, FINIT, FSAVE, and FSTSW. For further
information on waiting and non-waiting versions of these instructions, see their corresponding pages in
Volume 5.
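
The relationship between the two forms can be sketched as follows. The pairing of mnemonics comes from the paragraph above and the byte sequences for FNINIT, FNCLEX, and FNSTSW AX come from Table A-15; the names `WAITING_TO_NON_WAITING` and `waiting_encoding` are hypothetical helpers for this example, not assembler syntax.

```python
FWAIT = 0x9B  # opcode inserted ahead of the non-waiting instruction

# Waiting mnemonic -> non-waiting mnemonic, as listed in the text above.
WAITING_TO_NON_WAITING = {
    "FSTENV": "FNSTENV",
    "FSTCW": "FNSTCW",
    "FCLEX": "FNCLEX",
    "FINIT": "FNINIT",
    "FSAVE": "FNSAVE",
    "FSTSW": "FNSTSW",
}


def waiting_encoding(non_waiting_bytes: bytes) -> bytes:
    """Prefix a non-waiting x87 encoding with FWAIT to get the waiting form."""
    return bytes([FWAIT]) + bytes(non_waiting_bytes)


if __name__ == "__main__":
    fninit = bytes([0xDB, 0xE3])     # FNINIT (Table A-15, opcode DB)
    fnstsw_ax = bytes([0xDF, 0xE0])  # FNSTSW AX (Table A-15, opcode DF)
    print(waiting_encoding(fninit).hex(" "))     # FINIT
    print(waiting_encoding(fnstsw_ax).hex(" "))  # FSTSW AX
```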

Table A-15. x87 Opcodes and ModRM Extensions

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| D8 | !11 (00-BF) | FADD mem32real | FMUL mem32real | FCOM mem32real | FCOMP mem32real | FSUB mem32real | FSUBR mem32real | FDIV mem32real | FDIVR mem32real |
| | 11 | C0: FADD ST(0), ST(0) | C8: FMUL ST(0), ST(0) | D0: FCOM ST(0), ST(0) | D8: FCOMP ST(0), ST(0) | E0: FSUB ST(0), ST(0) | E8: FSUBR ST(0), ST(0) | F0: FDIV ST(0), ST(0) | F8: FDIVR ST(0), ST(0) |
| | | C1: FADD ST(0), ST(1) | C9: FMUL ST(0), ST(1) | D1: FCOM ST(0), ST(1) | D9: FCOMP ST(0), ST(1) | E1: FSUB ST(0), ST(1) | E9: FSUBR ST(0), ST(1) | F1: FDIV ST(0), ST(1) | F9: FDIVR ST(0), ST(1) |
| | | C2: FADD ST(0), ST(2) | CA: FMUL ST(0), ST(2) | D2: FCOM ST(0), ST(2) | DA: FCOMP ST(0), ST(2) | E2: FSUB ST(0), ST(2) | EA: FSUBR ST(0), ST(2) | F2: FDIV ST(0), ST(2) | FA: FDIVR ST(0), ST(2) |
| | | C3: FADD ST(0), ST(3) | CB: FMUL ST(0), ST(3) | D3: FCOM ST(0), ST(3) | DB: FCOMP ST(0), ST(3) | E3: FSUB ST(0), ST(3) | EB: FSUBR ST(0), ST(3) | F3: FDIV ST(0), ST(3) | FB: FDIVR ST(0), ST(3) |
| | | C4: FADD ST(0), ST(4) | CC: FMUL ST(0), ST(4) | D4: FCOM ST(0), ST(4) | DC: FCOMP ST(0), ST(4) | E4: FSUB ST(0), ST(4) | EC: FSUBR ST(0), ST(4) | F4: FDIV ST(0), ST(4) | FC: FDIVR ST(0), ST(4) |
| | | C5: FADD ST(0), ST(5) | CD: FMUL ST(0), ST(5) | D5: FCOM ST(0), ST(5) | DD: FCOMP ST(0), ST(5) | E5: FSUB ST(0), ST(5) | ED: FSUBR ST(0), ST(5) | F5: FDIV ST(0), ST(5) | FD: FDIVR ST(0), ST(5) |
| | | C6: FADD ST(0), ST(6) | CE: FMUL ST(0), ST(6) | D6: FCOM ST(0), ST(6) | DE: FCOMP ST(0), ST(6) | E6: FSUB ST(0), ST(6) | EE: FSUBR ST(0), ST(6) | F6: FDIV ST(0), ST(6) | FE: FDIVR ST(0), ST(6) |
| | | C7: FADD ST(0), ST(7) | CF: FMUL ST(0), ST(7) | D7: FCOM ST(0), ST(7) | DF: FCOMP ST(0), ST(7) | E7: FSUB ST(0), ST(7) | EF: FSUBR ST(0), ST(7) | F7: FDIV ST(0), ST(7) | FF: FDIVR ST(0), ST(7) |
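
The D8 entries of Table A-15 are regular enough to decode with a short function. The sketch below is a minimal illustration, not a general x87 decoder: it handles only opcode D8h, and the names `D8_MNEMONICS` and `decode_d8` are assumptions made for the example. The operand strings follow the notation used in the table.

```python
# Mnemonics for opcode D8h, indexed by the ModRM reg field (/0 through /7).
D8_MNEMONICS = ["FADD", "FMUL", "FCOM", "FCOMP", "FSUB", "FSUBR", "FDIV", "FDIVR"]


def decode_d8(modrm: int) -> str:
    """Return the Table A-15 entry for opcode D8h and the given ModRM byte."""
    if not 0 <= modrm <= 0xFF:
        raise ValueError("ModRM must be a single byte")
    mnemonic = D8_MNEMONICS[(modrm >> 3) & 0b111]
    if modrm <= 0xBF:
        # Memory form: a 32-bit real operand addressed through ModRM.
        return f"{mnemonic} mem32real"
    # Register form: the r/m field selects ST(i).
    return f"{mnemonic} ST(0), ST({modrm & 0b111})"


if __name__ == "__main__":
    for byte in (0x08, 0xC1, 0xFF):
        print(f"D8 {byte:02X} -> {decode_d8(byte)}")
```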

Table A-15. x87 Opcodes and ModRM Extensions (continued)

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| D9 | !11 (00-BF) | FLD mem32real | | FST mem32real | FSTP mem32real | FLDENV mem14/28env | FLDCW mem16 | FNSTENV mem14/28env | FNSTCW mem16 |
| | 11 | C0: FLD ST(0), ST(0) | C8: FXCH ST(0), ST(0) | D0: FNOP | D8: reserved | E0: FCHS | E8: FLD1 | F0: F2XM1 | F8: FPREM |
| | | C1: FLD ST(0), ST(1) | C9: FXCH ST(0), ST(1) | D1: invalid | D9: reserved | E1: FABS | E9: FLDL2T | F1: FYL2X | F9: FYL2XP1 |
| | | C2: FLD ST(0), ST(2) | CA: FXCH ST(0), ST(2) | D2: invalid | DA: reserved | E2: invalid | EA: FLDL2E | F2: FPTAN | FA: FSQRT |
| | | C3: FLD ST(0), ST(3) | CB: FXCH ST(0), ST(3) | D3: invalid | DB: reserved | E3: invalid | EB: FLDPI | F3: FPATAN | FB: FSINCOS |
| | | C4: FLD ST(0), ST(4) | CC: FXCH ST(0), ST(4) | D4: invalid | DC: reserved | E4: FTST | EC: FLDLG2 | F4: FXTRACT | FC: FRNDINT |
| | | C5: FLD ST(0), ST(5) | CD: FXCH ST(0), ST(5) | D5: invalid | DD: reserved | E5: FXAM | ED: FLDLN2 | F5: FPREM1 | FD: FSCALE |
| | | C6: FLD ST(0), ST(6) | CE: FXCH ST(0), ST(6) | D6: invalid | DE: reserved | E6: invalid | EE: FLDZ | F6: FDECSTP | FE: FSIN |
| | | C7: FLD ST(0), ST(7) | CF: FXCH ST(0), ST(7) | D7: invalid | DF: reserved | E7: invalid | EF: invalid | F7: FINCSTP | FF: FCOS |

Table A-15. x87 Opcodes and ModRM Extensions (continued)

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| DA | !11 (00-BF) | FIADD mem32int | FIMUL mem32int | FICOM mem32int | FICOMP mem32int | FISUB mem32int | FISUBR mem32int | FIDIV mem32int | FIDIVR mem32int |
| | 11 | C0: FCMOVB ST(0), ST(0) | C8: FCMOVE ST(0), ST(0) | D0: FCMOVBE ST(0), ST(0) | D8: FCMOVU ST(0), ST(0) | E0: invalid | E8: invalid | F0: invalid | F8: invalid |
| | | C1: FCMOVB ST(0), ST(1) | C9: FCMOVE ST(0), ST(1) | D1: FCMOVBE ST(0), ST(1) | D9: FCMOVU ST(0), ST(1) | E1: invalid | E9: FUCOMPP | F1: invalid | F9: invalid |
| | | C2: FCMOVB ST(0), ST(2) | CA: FCMOVE ST(0), ST(2) | D2: FCMOVBE ST(0), ST(2) | DA: FCMOVU ST(0), ST(2) | E2: invalid | EA: invalid | F2: invalid | FA: invalid |
| | | C3: FCMOVB ST(0), ST(3) | CB: FCMOVE ST(0), ST(3) | D3: FCMOVBE ST(0), ST(3) | DB: FCMOVU ST(0), ST(3) | E3: invalid | EB: invalid | F3: invalid | FB: invalid |
| | | C4: FCMOVB ST(0), ST(4) | CC: FCMOVE ST(0), ST(4) | D4: FCMOVBE ST(0), ST(4) | DC: FCMOVU ST(0), ST(4) | E4: invalid | EC: invalid | F4: invalid | FC: invalid |
| | | C5: FCMOVB ST(0), ST(5) | CD: FCMOVE ST(0), ST(5) | D5: FCMOVBE ST(0), ST(5) | DD: FCMOVU ST(0), ST(5) | E5: invalid | ED: invalid | F5: invalid | FD: invalid |
| | | C6: FCMOVB ST(0), ST(6) | CE: FCMOVE ST(0), ST(6) | D6: FCMOVBE ST(0), ST(6) | DE: FCMOVU ST(0), ST(6) | E6: invalid | EE: invalid | F6: invalid | FE: invalid |
| | | C7: FCMOVB ST(0), ST(7) | CF: FCMOVE ST(0), ST(7) | D7: FCMOVBE ST(0), ST(7) | DF: FCMOVU ST(0), ST(7) | E7: invalid | EF: invalid | F7: invalid | FF: invalid |

Table A-15. x87 Opcodes and ModRM Extensions (continued)

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| DB | !11 (00-BF) | FILD mem32int | FISTTP mem32int | FIST mem32int | FISTP mem32int | invalid | FLD mem80real | invalid | FSTP mem80real |
| | 11 | C0: FCMOVNB ST(0), ST(0) | C8: FCMOVNE ST(0), ST(0) | D0: FCMOVNBE ST(0), ST(0) | D8: FCMOVNU ST(0), ST(0) | E0: reserved | E8: FUCOMI ST(0), ST(0) | F0: FCOMI ST(0), ST(0) | F8: invalid |
| | | C1: FCMOVNB ST(0), ST(1) | C9: FCMOVNE ST(0), ST(1) | D1: FCMOVNBE ST(0), ST(1) | D9: FCMOVNU ST(0), ST(1) | E1: reserved | E9: FUCOMI ST(0), ST(1) | F1: FCOMI ST(0), ST(1) | F9: invalid |
| | | C2: FCMOVNB ST(0), ST(2) | CA: FCMOVNE ST(0), ST(2) | D2: FCMOVNBE ST(0), ST(2) | DA: FCMOVNU ST(0), ST(2) | E2: FNCLEX | EA: FUCOMI ST(0), ST(2) | F2: FCOMI ST(0), ST(2) | FA: invalid |
| | | C3: FCMOVNB ST(0), ST(3) | CB: FCMOVNE ST(0), ST(3) | D3: FCMOVNBE ST(0), ST(3) | DB: FCMOVNU ST(0), ST(3) | E3: FNINIT | EB: FUCOMI ST(0), ST(3) | F3: FCOMI ST(0), ST(3) | FB: invalid |
| | | C4: FCMOVNB ST(0), ST(4) | CC: FCMOVNE ST(0), ST(4) | D4: FCMOVNBE ST(0), ST(4) | DC: FCMOVNU ST(0), ST(4) | E4: reserved | EC: FUCOMI ST(0), ST(4) | F4: FCOMI ST(0), ST(4) | FC: invalid |
| | | C5: FCMOVNB ST(0), ST(5) | CD: FCMOVNE ST(0), ST(5) | D5: FCMOVNBE ST(0), ST(5) | DD: FCMOVNU ST(0), ST(5) | E5: invalid | ED: FUCOMI ST(0), ST(5) | F5: FCOMI ST(0), ST(5) | FD: invalid |
| | | C6: FCMOVNB ST(0), ST(6) | CE: FCMOVNE ST(0), ST(6) | D6: FCMOVNBE ST(0), ST(6) | DE: FCMOVNU ST(0), ST(6) | E6: invalid | EE: FUCOMI ST(0), ST(6) | F6: FCOMI ST(0), ST(6) | FE: invalid |
| | | C7: FCMOVNB ST(0), ST(7) | CF: FCMOVNE ST(0), ST(7) | D7: FCMOVNBE ST(0), ST(7) | DF: FCMOVNU ST(0), ST(7) | E7: invalid | EF: FUCOMI ST(0), ST(7) | F7: FCOMI ST(0), ST(7) | FF: invalid |

Table A-15. x87 Opcodes and ModRM Extensions (continued)

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| DC | !11 (00-BF) | FADD mem64real | FMUL mem64real | FCOM mem64real | FCOMP mem64real | FSUB mem64real | FSUBR mem64real | FDIV mem64real | FDIVR mem64real |
| | 11 | C0: FADD ST(0), ST(0) | C8: FMUL ST(0), ST(0) | D0: reserved | D8: reserved | E0: FSUBR ST(0), ST(0) | E8: FSUB ST(0), ST(0) | F0: FDIVR ST(0), ST(0) | F8: FDIV ST(0), ST(0) |
| | | C1: FADD ST(1), ST(0) | C9: FMUL ST(1), ST(0) | D1: reserved | D9: reserved | E1: FSUBR ST(1), ST(0) | E9: FSUB ST(1), ST(0) | F1: FDIVR ST(1), ST(0) | F9: FDIV ST(1), ST(0) |
| | | C2: FADD ST(2), ST(0) | CA: FMUL ST(2), ST(0) | D2: reserved | DA: reserved | E2: FSUBR ST(2), ST(0) | EA: FSUB ST(2), ST(0) | F2: FDIVR ST(2), ST(0) | FA: FDIV ST(2), ST(0) |
| | | C3: FADD ST(3), ST(0) | CB: FMUL ST(3), ST(0) | D3: reserved | DB: reserved | E3: FSUBR ST(3), ST(0) | EB: FSUB ST(3), ST(0) | F3: FDIVR ST(3), ST(0) | FB: FDIV ST(3), ST(0) |
| | | C4: FADD ST(4), ST(0) | CC: FMUL ST(4), ST(0) | D4: reserved | DC: reserved | E4: FSUBR ST(4), ST(0) | EC: FSUB ST(4), ST(0) | F4: FDIVR ST(4), ST(0) | FC: FDIV ST(4), ST(0) |
| | | C5: FADD ST(5), ST(0) | CD: FMUL ST(5), ST(0) | D5: reserved | DD: reserved | E5: FSUBR ST(5), ST(0) | ED: FSUB ST(5), ST(0) | F5: FDIVR ST(5), ST(0) | FD: FDIV ST(5), ST(0) |
| | | C6: FADD ST(6), ST(0) | CE: FMUL ST(6), ST(0) | D6: reserved | DE: reserved | E6: FSUBR ST(6), ST(0) | EE: FSUB ST(6), ST(0) | F6: FDIVR ST(6), ST(0) | FE: FDIV ST(6), ST(0) |
| | | C7: FADD ST(7), ST(0) | CF: FMUL ST(7), ST(0) | D7: reserved | DF: reserved | E7: FSUBR ST(7), ST(0) | EF: FSUB ST(7), ST(0) | F7: FDIVR ST(7), ST(0) | FF: FDIV ST(7), ST(0) |

Table A-15. x87 Opcodes and ModRM Extensions (continued)

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| DD | !11 (00-BF) | FLD mem64real | FISTTP mem64int | FST mem64real | FSTP mem64real | FRSTOR mem98/108env | invalid | FNSAVE mem98/108env | FNSTSW mem16 |
| | 11 | C0: FFREE ST(0) | C8: reserved | D0: FST ST(0) | D8: FSTP ST(0) | E0: FUCOM ST(0), ST(0) | E8: FUCOMP ST(0) | F0: invalid | F8: invalid |
| | | C1: FFREE ST(1) | C9: reserved | D1: FST ST(1) | D9: FSTP ST(1) | E1: FUCOM ST(1), ST(0) | E9: FUCOMP ST(1) | F1: invalid | F9: invalid |
| | | C2: FFREE ST(2) | CA: reserved | D2: FST ST(2) | DA: FSTP ST(2) | E2: FUCOM ST(2), ST(0) | EA: FUCOMP ST(2) | F2: invalid | FA: invalid |
| | | C3: FFREE ST(3) | CB: reserved | D3: FST ST(3) | DB: FSTP ST(3) | E3: FUCOM ST(3), ST(0) | EB: FUCOMP ST(3) | F3: invalid | FB: invalid |
| | | C4: FFREE ST(4) | CC: reserved | D4: FST ST(4) | DC: FSTP ST(4) | E4: FUCOM ST(4), ST(0) | EC: FUCOMP ST(4) | F4: invalid | FC: invalid |
| | | C5: FFREE ST(5) | CD: reserved | D5: FST ST(5) | DD: FSTP ST(5) | E5: FUCOM ST(5), ST(0) | ED: FUCOMP ST(5) | F5: invalid | FD: invalid |
| | | C6: FFREE ST(6) | CE: reserved | D6: FST ST(6) | DE: FSTP ST(6) | E6: FUCOM ST(6), ST(0) | EE: FUCOMP ST(6) | F6: invalid | FE: invalid |
| | | C7: FFREE ST(7) | CF: reserved | D7: FST ST(7) | DF: FSTP ST(7) | E7: FUCOM ST(7), ST(0) | EF: FUCOMP ST(7) | F7: invalid | FF: invalid |

Table A-15. x87 Opcodes and ModRM Extensions (continued)

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| DE | !11 (00-BF) | FIADD mem16int | FIMUL mem16int | FICOM mem16int | FICOMP mem16int | FISUB mem16int | FISUBR mem16int | FIDIV mem16int | FIDIVR mem16int |
| | 11 | C0: FADDP ST(0), ST(0) | C8: FMULP ST(0), ST(0) | D0: reserved | D8: invalid | E0: FSUBRP ST(0), ST(0) | E8: FSUBP ST(0), ST(0) | F0: FDIVRP ST(0), ST(0) | F8: FDIVP ST(0), ST(0) |
| | | C1: FADDP ST(1), ST(0) | C9: FMULP ST(1), ST(0) | D1: reserved | D9: FCOMPP | E1: FSUBRP ST(1), ST(0) | E9: FSUBP ST(1), ST(0) | F1: FDIVRP ST(1), ST(0) | F9: FDIVP ST(1), ST(0) |
| | | C2: FADDP ST(2), ST(0) | CA: FMULP ST(2), ST(0) | D2: reserved | DA: invalid | E2: FSUBRP ST(2), ST(0) | EA: FSUBP ST(2), ST(0) | F2: FDIVRP ST(2), ST(0) | FA: FDIVP ST(2), ST(0) |
| | | C3: FADDP ST(3), ST(0) | CB: FMULP ST(3), ST(0) | D3: reserved | DB: invalid | E3: FSUBRP ST(3), ST(0) | EB: FSUBP ST(3), ST(0) | F3: FDIVRP ST(3), ST(0) | FB: FDIVP ST(3), ST(0) |
| | | C4: FADDP ST(4), ST(0) | CC: FMULP ST(4), ST(0) | D4: reserved | DC: invalid | E4: FSUBRP ST(4), ST(0) | EC: FSUBP ST(4), ST(0) | F4: FDIVRP ST(4), ST(0) | FC: FDIVP ST(4), ST(0) |
| | | C5: FADDP ST(5), ST(0) | CD: FMULP ST(5), ST(0) | D5: reserved | DD: invalid | E5: FSUBRP ST(5), ST(0) | ED: FSUBP ST(5), ST(0) | F5: FDIVRP ST(5), ST(0) | FD: FDIVP ST(5), ST(0) |
| | | C6: FADDP ST(6), ST(0) | CE: FMULP ST(6), ST(0) | D6: reserved | DE: invalid | E6: FSUBRP ST(6), ST(0) | EE: FSUBP ST(6), ST(0) | F6: FDIVRP ST(6), ST(0) | FE: FDIVP ST(6), ST(0) |
| | | C7: FADDP ST(7), ST(0) | CF: FMULP ST(7), ST(0) | D7: reserved | DF: invalid | E7: FSUBRP ST(7), ST(0) | EF: FSUBP ST(7), ST(0) | F7: FDIVRP ST(7), ST(0) | FF: FDIVP ST(7), ST(0) |

Table A-15. x87 Opcodes and ModRM Extensions (continued)

| Opcode | ModRM mod Field | reg /0 | reg /1 | reg /2 | reg /3 | reg /4 | reg /5 | reg /6 | reg /7 |
|---|---|---|---|---|---|---|---|---|---|
| DF | !11 (00-BF) | FILD mem16int | FISTTP mem16int | FIST mem16int | FISTP mem16int | FBLD mem80dec | FILD mem64int | FBSTP mem80dec | FISTP mem64int |
| | 11 | C0: reserved | C8: reserved | D0: reserved | D8: reserved | E0: FNSTSW AX | E8: FUCOMIP ST(0), ST(0) | F0: FCOMIP ST(0), ST(0) | F8: invalid |
| | | C1: reserved | C9: reserved | D1: reserved | D9: reserved | E1: invalid | E9: FUCOMIP ST(0), ST(1) | F1: FCOMIP ST(0), ST(1) | F9: invalid |
| | | C2: reserved | CA: reserved | D2: reserved | DA: reserved | E2: invalid | EA: FUCOMIP ST(0), ST(2) | F2: FCOMIP ST(0), ST(2) | FA: invalid |
| | | C3: reserved | CB: reserved | D3: reserved | DB: reserved | E3: invalid | EB: FUCOMIP ST(0), ST(3) | F3: FCOMIP ST(0), ST(3) | FB: invalid |
| | | C4: reserved | CC: reserved | D4: reserved | DC: reserved | E4: invalid | EC: FUCOMIP ST(0), ST(4) | F4: FCOMIP ST(0), ST(4) | FC: invalid |
| | | C5: reserved | CD: reserved | D5: reserved | DD: reserved | E5: invalid | ED: FUCOMIP ST(0), ST(5) | F5: FCOMIP ST(0), ST(5) | FD: invalid |
| | | C6: reserved | CE: reserved | D6: reserved | DE: reserved | E6: invalid | EE: FUCOMIP ST(0), ST(6) | F6: FCOMIP ST(0), ST(6) | FE: invalid |
| | | C7: reserved | CF: reserved | D7: reserved | DF: reserved | E7: invalid | EF: FUCOMIP ST(0), ST(7) | F7: FCOMIP ST(0), ST(7) | FF: invalid |

A.1.4 rFLAGS Condition Codes for x87 Opcodes

Table A-16 shows the rFLAGS condition codes specified by the opcode and ModRM bytes of the 
FCMOVcc instructions. 


Table A-16. rFLAGS Condition Codes for FCMOVcc

| Opcode (hex) | ModRM mod Field | ModRM reg Field | rFLAGS Value | cc Mnemonic | Condition |
|---|---|---|---|---|---|
| DA | 11 | 000 | CF = 1 | B | Below |
| | | 001 | ZF = 1 | E | Equal |
| | | 010 | CF = 1 or ZF = 1 | BE | Below or Equal |
| | | 011 | PF = 1 | U | Unordered |
| DB | 11 | 000 | CF = 0 | NB | Not Below |
| | | 001 | ZF = 0 | NE | Not Equal |
| | | 010 | CF = 0 and ZF = 0 | NBE | Not Below or Equal |
| | | 011 | PF = 0 | NU | Not Unordered |
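
Table A-16 can be expressed as a small lookup keyed by opcode and ModRM reg field. The sketch below is illustrative only; `FCMOV_CONDITIONS` and `fcmov_condition` are hypothetical names, and the flag values are passed in as plain integers rather than read from a real rFLAGS image.

```python
# (opcode, ModRM reg field) -> (cc mnemonic, predicate over CF, ZF, PF),
# transcribed from Table A-16.
FCMOV_CONDITIONS = {
    (0xDA, 0b000): ("B",   lambda cf, zf, pf: cf == 1),
    (0xDA, 0b001): ("E",   lambda cf, zf, pf: zf == 1),
    (0xDA, 0b010): ("BE",  lambda cf, zf, pf: cf == 1 or zf == 1),
    (0xDA, 0b011): ("U",   lambda cf, zf, pf: pf == 1),
    (0xDB, 0b000): ("NB",  lambda cf, zf, pf: cf == 0),
    (0xDB, 0b001): ("NE",  lambda cf, zf, pf: zf == 0),
    (0xDB, 0b010): ("NBE", lambda cf, zf, pf: cf == 0 and zf == 0),
    (0xDB, 0b011): ("NU",  lambda cf, zf, pf: pf == 0),
}


def fcmov_condition(opcode: int, modrm: int, cf: int, zf: int, pf: int):
    """Return (mnemonic, condition_met) for an FCMOVcc encoding."""
    mod, reg = modrm >> 6, (modrm >> 3) & 0b111
    if mod != 0b11:
        raise ValueError("FCMOVcc requires ModRM.mod = 11")
    entry = FCMOV_CONDITIONS.get((opcode, reg))
    if entry is None:
        raise ValueError("not an FCMOVcc encoding")
    cc, predicate = entry
    return "FCMOV" + cc, predicate(cf, zf, pf)


if __name__ == "__main__":
    print(fcmov_condition(0xDA, 0xC1, cf=1, zf=0, pf=0))
    print(fcmov_condition(0xDB, 0xD3, cf=0, zf=1, pf=0))
```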


A.1.5 Extended Instruction Opcode Maps 

The following sections present the VEX and the XOP extended instruction opcode maps. The
VEX.map_select field of the three-byte VEX encoding escape sequence selects VEX opcode maps:
01h, 02h, or 03h. The two-byte VEX encoding escape sequence implicitly selects the VEX map 01h.

The XOP.map_select field selects between the three XOP maps: 08h, 09h, or 0Ah.
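
The map selection described above can be illustrated with a short sketch that pulls the map_select, W, vvvv, L, and pp fields out of a VEX prefix. The bit positions used here are an assumption as far as this section is concerned, since they are defined in the VEX prefix description rather than in this appendix: escape byte C4h introduces the three-byte form and C5h the two-byte form. The function name `decode_vex_prefix` is hypothetical, and the sketch assumes the caller already knows the bytes are a VEX prefix; it does not handle XOP.

```python
def decode_vex_prefix(data: bytes) -> dict:
    """Extract the fields that select a VEX opcode map entry.

    Assumed layout (not defined in this section):
      C4h, [R X B map_select(5)], [W vvvv(4) L pp(2)]
      C5h, [R vvvv(4) L pp(2)]          (map 01h implied)
    vvvv is stored inverted in the prefix.
    """
    if len(data) >= 3 and data[0] == 0xC4:
        b1, b2 = data[1], data[2]
        return {
            "length": 3,
            "map_select": b1 & 0x1F,
            "W": b2 >> 7,
            "vvvv": ~(b2 >> 3) & 0xF,
            "L": (b2 >> 2) & 1,
            "pp": b2 & 0b11,
        }
    if len(data) >= 2 and data[0] == 0xC5:
        b1 = data[1]
        return {
            "length": 2,
            "map_select": 0x01,  # implied by the two-byte form
            "W": 0,              # no W bit in the two-byte form
            "vvvv": ~(b1 >> 3) & 0xF,
            "L": (b1 >> 2) & 1,
            "pp": b1 & 0b11,
        }
    raise ValueError("not a VEX prefix")


if __name__ == "__main__":
    print(decode_vex_prefix(bytes([0xC5, 0xF8, 0x77])))
    print(decode_vex_prefix(bytes([0xC4, 0xE2, 0x79, 0x18])))
```

With these fields in hand, the opcode byte that follows the prefix is looked up in the map named by map_select, using pp to pick the row and L or W where a table entry depends on them.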

VEX Opcode Maps. Tables A-17 through A-23 below present the VEX opcode maps, and Table A-24 on
page 493 presents the VEX opcode groups.

Table A-17. VEX Opcode Map 1, Low Nibble = [0h:7h]

| VEX.pp | Opcode | x0h | x1h | x2h | x3h | x4h | x5h | x6h | x7h |
|---|---|---|---|---|---|---|---|---|---|
| | 0xh | | | | | | | | |
| 00 | 1xh | VMOVUPS² Vpsx, Wpsx | VMOVUPS² Wpsx, Vpsx | VMOVLPS Vps, Hps, Mq; VMOVHLPS Vps, Hps, Ups | VMOVLPS Mq, Vps | VUNPCKLPS² Vpsx, Hpsx, Wpsx | VUNPCKHPS² Vpsx, Hpsx, Wpsx | VMOVHPS Vps, Hps, Mq; VMOVLHPS Vps, Hps, Ups | VMOVHPS Mq, Vps |
| 01 | | VMOVUPD² Vpdx, Wpdx | VMOVUPD² Wpdx, Vpdx | VMOVLPD Vo, Ho, Mq | VMOVLPD Mq, Vo | VUNPCKLPD² Vpdx, Hpdx, Wpdx | VUNPCKHPD² Vpdx, Hpdx, Wpdx | VMOVHPD Vpd, Hpd, Mq | VMOVHPD Mq, Vpd |
| 10 | | VMOVSS³ Vss, Md; VMOVSS Vss, Hss, Uss | VMOVSS³ Md, Vss; VMOVSS Uss, Hss, Vss | VMOVSLDUP² Vpsx, Wpsx | | | | VMOVSHDUP² Vpsx, Wpsx | |
| 11 | | VMOVSD³ Vsd, Mq; VMOVSD Vsd, Hsd, Usd | VMOVSD³ Mq, Vsd; VMOVSD Usd, Hsd, Vsd | VMOVDDUP Vo, Wq (L=0); Vdo, Wdo (L=1) | | | | | |
| | 2xh-4xh | | | | | | | | |
| 00 | 5xh | VMOVMSKPS² Gy, Upsx | VSQRTPS² Vpsx, Wpsx | VRSQRTPS² Vpsx, Wpsx | VRCPPS² Vpsx, Wpsx | VANDPS² Vpsx, Hpsx, Wpsx | VANDNPS² Vpsx, Hpsx, Wpsx | VORPS² Vpsx, Hpsx, Wpsx | VXORPS² Vpsx, Hpsx, Wpsx |
| 01 | | VMOVMSKPD² Gy, Updx | VSQRTPD² Vpdx, Wpdx | | | VANDPD² Vpdx, Hpdx, Wpdx | VANDNPD² Vpdx, Hpdx, Wpdx | VORPD² Vpdx, Hpdx, Wpdx | VXORPD² Vpdx, Hpdx, Wpdx |
| 10 | | | VSQRTSS³ Vo, Ho, Wss | VRSQRTSS³ Vo, Ho, Wss | VRCPSS³ Vo, Ho, Wss | | | | |
| 11 | | | VSQRTSD³ Vo, Ho, Wsd | | | | | | |
| 00 | 6xh | | | | | | | | |
| 01 | | VPUNPCKLBW² Vpbx, Hpbx, Wpbx | VPUNPCKLWD² Vpwx, Hpwx, Wpwx | VPUNPCKLDQ² Vpdwx, Hpdwx, Wpdwx | VPACKSSWB² Vpkx, Hpix, Wpix | VPCMPGTB² Vpbx, Hpkx, Wpkx | VPCMPGTW² Vpwx, Hpix, Wpix | VPCMPGTD² Vpdwx, Hpjx, Wpjx | VPACKUSWB² Vpkx, Hpix, Wpix |
| 00 | 7xh | | | | | | | | VZEROUPPER (L=0); VZEROALL (L=1) |
| 01 | | VPSHUFD² Vpdwx, Wpdwx, Ib | VEX group #12 | VEX group #13 | VEX group #14 | VPCMPEQB² Vpbx, Hpkx, Wpkx | VPCMPEQW² Vpwx, Hpix, Wpix | VPCMPEQD² Vpdwx, Hpjx, Wpjx | |
| 10 | | VPSHUFHW² Vpwx, Wpwx, Ib | | | | | | | |
| 11 | | VPSHUFLW² Vpwx, Wpwx, Ib | | | | | | | |
| | 8xh-Bxh | | | | | | | | |
| 00 | Cxh | | | VCMPccPS¹ Vpdw, Hps, Wps, Ib | | | | VSHUFPS² Vpsx, Hpsx, Wpsx, Ib | |
| 01 | | | | VCMPccPD¹ Vpqw, Hpd, Wpd, Ib | | VPINSRW Vpw, Hpw, Mw, Ib; Vpw, Hpw, Rd, Ib | VPEXTRW Gw, Upw, Ib | VSHUFPD² Vpdx, Hpdx, Wpdx, Ib | |
| 10 | | | | VCMPccSS¹ Vd, Hss, Wss, Ib | | | | | |
| 11 | | | | VCMPccSD¹ Vq, Hsd, Wsd, Ib | | | | | |

Note 1: The condition codes are: EQ, LT, LE, UNORD, NEQ, NLT, NLE, and ORD; encoded as [00:07h] using Ib.
VEX encoding adds: EQ_UQ, NGE, NGT, FALSE, NEQ_OQ, GE, GT, TRUE [08:0Fh];
EQ_OS, LT_OQ, LE_OQ, UNORD_S, NEQ_US, NLT_UQ, NLE_UQ, ORD_S [10h:17h]; and
EQ_US, NGE_UQ, NGT_UQ, FALSE_OS, NEQ_OS, GE_OQ, GT_OQ, TRUE_US [18:1Fh].

Note 2: Supports both 128-bit and 256-bit vector sizes. Vector size is specified using the VEX.L bit. When L = 0, size is 128 bits; when L = 1, size is 256 bits.
Note 3: Operands are scalars. VEX.L bit is ignored.
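
Note 1 to Table A-17 amounts to a 32-entry lookup from the immediate byte to a comparison predicate. The following sketch transcribes that list; `VCMP_PREDICATES` and `vcmp_mnemonic` are hypothetical names, and rejecting values above 1Fh is a choice made for this example rather than a statement about how hardware treats the upper bits of the immediate.

```python
# Predicate names in immediate-byte order, from Note 1 to Table A-17.
VCMP_PREDICATES = [
    # 00h-07h: available in both legacy and VEX encodings
    "EQ", "LT", "LE", "UNORD", "NEQ", "NLT", "NLE", "ORD",
    # 08h-0Fh: added by the VEX encoding
    "EQ_UQ", "NGE", "NGT", "FALSE", "NEQ_OQ", "GE", "GT", "TRUE",
    # 10h-17h
    "EQ_OS", "LT_OQ", "LE_OQ", "UNORD_S", "NEQ_US", "NLT_UQ", "NLE_UQ", "ORD_S",
    # 18h-1Fh
    "EQ_US", "NGE_UQ", "NGT_UQ", "FALSE_OS", "NEQ_OS", "GE_OQ", "GT_OQ", "TRUE_US",
]


def vcmp_mnemonic(imm8: int, suffix: str = "PS") -> str:
    """Build a VCMPccXX mnemonic from the immediate byte and a data-type suffix."""
    if suffix not in ("PS", "PD", "SS", "SD"):
        raise ValueError("suffix must be PS, PD, SS, or SD")
    if not 0 <= imm8 <= 0x1F:
        raise ValueError("predicate must be in the range 00h-1Fh")
    return f"VCMP{VCMP_PREDICATES[imm8]}{suffix}"


if __name__ == "__main__":
    print(vcmp_mnemonic(0x00))        # VCMPEQPS
    print(vcmp_mnemonic(0x0D, "SD"))  # VCMPGESD
```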

Table A-18. VEX Opcode Map 1, Low Nibble = [0h:7h] Continued

| VEX.pp | Opcode | x0h | x1h | x2h | x3h | x4h | x5h | x6h | x7h |
|---|---|---|---|---|---|---|---|---|---|
| 00 | Dxh | | | | | | | | |
| 01 | | VADDSUBPD² Vpdx, Hpdx, Wpdx | VPSRLW² Vpwx, Hpwx, Wx | VPSRLD² Vpdwx, Hpdwx, Wx | VPSRLQ² Vpqwx, Hpqwx, Wx | VPADDQ² Vpq, Hpq, Wpq | VPMULLW² Vpix, Hpix, Wpix | VMOVQ Wq, Vq (VEX.L=1) | VPMOVMSKB² Gy, Upbx |
| 10 | | | | | | | | | |
| 11 | | VADDSUBPS² Vpsx, Hpsx, Wpsx | | | | | | | |
| 00 | Exh | | | | | | | | |
| 01 | | VPAVGB² Vpkx, Hpkx, Wpkx | VPSRAW² Vpwx, Hpwx, Wx | VPSRAD² Vpdwx, Hpdwx, Wx | VPAVGW² Vpix, Hpix, Wpix | VPMULHUW² Vpi, Hpi, Wpi | VPMULHW Vpi, Hpi, Wpi | VCVTTPD2DQ² Vpjx, Wpdx | VMOVNTDQ Mo, Vo (L=0); Mdo, Vdo (L=1) |
| 10 | | | | | | | | VCVTDQ2PD² Vpdx, Wpjx | |
| 11 | | | | | | | | VCVTPD2DQ² Vpjx, Wpdx | |
| 00 | Fxh | | | | | | | | |
| 01 | | | VPSLLW² Vpwx, Hpwx, Wo.qx | VPSLLD² Vpdwx, Hpdwx, Wo.qx | VPSLLQ² Vpqwx, Hpqwx, Wo.qx | VPMULUDQ² Vpqx, Hpjx, Wpjx | VPMADDWD² Vpjx, Hpix, Wpix | VPSADBW² Vpix, Hpkx, Wpkx | VMASKMOVDQU Vpb, Upb |
| 10 | | | | | | | | | |
| 11 | | VLDDQU Vo, Mo (L=0); Vdo, Mdo (L=1) | | | | | | | |

Note 1: The condition codes are: EQ, LT, LE, UNORD, NEQ, NLT, NLE, and ORD; encoded as [00:07h] using Ib.
VEX encoding adds: EQ_UQ, NGE, NGT, FALSE, NEQ_OQ, GE, GT, TRUE [08:0Fh];
EQ_OS, LT_OQ, LE_OQ, UNORD_S, NEQ_US, NLT_UQ, NLE_UQ, ORD_S [10h:17h]; and
EQ_US, NGE_UQ, NGT_UQ, FALSE_OS, NEQ_OS, GE_OQ, GT_OQ, TRUE_US [18:1Fh].

Note 2: Supports both 128-bit and 256-bit vector sizes. Vector size is specified using the VEX.L bit. When L = 0, size is 128 bits; when L = 1, size is 256 bits.
Note 3: Operands are scalars. VEX.L bit is ignored.

Table A-19. VEX Opcode Map 1, Low Nibble = [8h:Fh]

| VEX.pp | Opcode | x8h | x9h | xAh | xBh | xCh | xDh | xEh | xFh |
|---|---|---|---|---|---|---|---|---|---|
| | 0xh-1xh | | | | | | | | |
| 00 | 2xh | VMOVAPS¹ Vpsx, Wpsx | VMOVAPS¹ Wpsx, Vpsx | | VMOVNTPS¹ Mpsx, Vpsx | | | VUCOMISS² Vss, Wss | VCOMISS² Vss, Wss |
| 01 | | VMOVAPD¹ Vpdx, Wpdx | VMOVAPD¹ Wpdx, Vpdx | | VMOVNTPD¹ Mpdx, Vpdx | | | VUCOMISD² Vsd, Wsd | VCOMISD² Vsd, Wsd |
| 10 | | | | VCVTSI2SS² Vo, Ho, Ey | | VCVTTSS2SI² Gy, Wss | VCVTSS2SI² Gy, Wss | | |
| 11 | | | | VCVTSI2SD² Vo, Ho, Ey | | VCVTTSD2SI² Gy, Wsd | VCVTSD2SI² Gy, Wsd | | |
| | 3xh-4xh | | | | | | | | |
| 00 | 5xh | VADDPS¹ Vpsx, Hpsx, Wpsx | VMULPS¹ Vpsx, Hpsx, Wpsx | VCVTPS2PD¹ Vpdx, Wpsx | VCVTDQ2PS¹ Vpsx, Wpjx | VSUBPS¹ Vpsx, Hpsx, Wpsx | VMINPS¹ Vpsx, Hpsx, Wpsx | VDIVPS¹ Vpsx, Hpsx, Wpsx | VMAXPS¹ Vpsx, Hpsx, Wpsx |
| 01 | | VADDPD¹ Vpdx, Hpdx, Wpdx | VMULPD¹ Vpdx, Hpdx, Wpdx | VCVTPD2PS¹ Vpsx, Wpdx | VCVTPS2DQ¹ Vpjx, Wpsx | VSUBPD¹ Vpdx, Hpdx, Wpdx | VMINPD¹ Vpdx, Hpdx, Wpdx | VDIVPD¹ Vpdx, Hpdx, Wpdx | VMAXPD¹ Vpdx, Hpdx, Wpdx |
| 10 | | VADDSS² Vss, Hss, Wss | VMULSS² Vss, Hss, Wss | VCVTSS2SD² Vo, Ho, Wss | VCVTTPS2DQ¹ Vpjx, Wpsx | VSUBSS² Vss, Hss, Wss | VMINSS² Vss, Hss, Wss | VDIVSS² Vss, Hss, Wss | VMAXSS² Vss, Hss, Wss |
| 11 | | VADDSD² Vsd, Hsd, Wsd | VMULSD² Vsd, Hsd, Wsd | VCVTSD2SS² Vo, Ho, Wsd | | VSUBSD² Vsd, Hsd, Wsd | VMINSD² Vsd, Hsd, Wsd | VDIVSD² Vsd, Hsd, Wsd | VMAXSD² Vsd, Hsd, Wsd |
| 00 | 6xh | | | | | | | | |
| 01 | | VPUNPCKHBW¹ Vpbx, Hpbx, Wpbx | VPUNPCKHWD¹ Vpwx, Hpwx, Wpwx | VPUNPCKHDQ¹ Vpdwx, Hpdwx, Wpdwx | VPACKSSDW¹ Vpix, Hpjx, Wpjx | VPUNPCKLQDQ¹ Vpqwx, Hpqwx, Wpqwx | VPUNPCKHQDQ¹ Vpqwx, Hpqwx, Wpqwx | VMOVD VMOVQ Vo, Ey (VEX.L=0) | VMOVDQA¹ Vpqwx, Wpqwx |
| 10 | | | | | | | | | VMOVDQU¹ Vpqwx, Wpqwx |
| 11 | | | | | | | | | |
| 00 | 7xh | | | | | | | | |
| 01 | | | | | | VHADDPD¹ Vpdx, Hpdx, Wpdx | VHSUBPD¹ Vpdx, Hpdx, Wpdx | VMOVD VMOVQ Ey, Vo (VEX.L=1) | VMOVDQA¹ Wpqwx, Vpqwx |
| 10 | | | | | | | | VMOVQ Vq, Wq (VEX.L=0) | VMOVDQU¹ Wpqwx, Vpqwx |
| 11 | | | | | | VHADDPS¹ Vpsx, Hpsx, Wpsx | VHSUBPS¹ Vpsx, Hpsx, Wpsx | | |
| | 8xh-9xh | | | | | | | | |
| n/a | Axh | | | | | | | VEX group #15 | |
| | Bxh-Cxh | | | | | | | | |
| 00 | Dxh | | | | | | | | |
| 01 | | VPSUBUSB¹ Vpkx, Hpkx, Wpkx | VPSUBUSW¹ Vpix, Hpix, Wpix | VPMINUB¹ Vpkx, Hpkx, Wpkx | VPAND¹ Vx, Hx, Wx | VPADDUSB¹ Vpkx, Hpkx, Wpkx | VPADDUSW¹ Vpix, Hpix, Wpix | VPMAXUB¹ Vpkx, Hpkx, Wpkx | VPANDN¹ Vx, Hx, Wx |
| 00 | Exh | | | | | | | | |
| 01 | | VPSUBSB¹ Vpkx, Hpkx, Wpkx | VPSUBSW¹ Vpix, Hpix, Wpix | VPMINSW¹ Vpix, Hpix, Wpix | VPOR¹ Vx, Hx, Wx | VPADDSB¹ Vpkx, Hpkx, Wpkx | VPADDSW¹ Vpix, Hpix, Wpix | VPMAXSW¹ Vpix, Hpix, Wpix | VPXOR¹ Vx, Hx, Wx |
| 00 | Fxh | | | | | | | | |
| 01 | | VPSUBB¹ Vpkx, Hpkx, Wpkx | VPSUBW¹ Vpix, Hpix, Wpix | VPSUBD¹ Vpjx, Hpjx, Wpjx | VPSUBQ¹ Vpqx, Hpqx, Wpqx | VPADDB¹ Vpkx, Hpkx, Wpkx | VPADDW¹ Vpix, Hpix, Wpix | VPADDD¹ Vpjx, Hpjx, Wpjx | |

Note 1: Supports both 128-bit and 256-bit vector sizes. Vector size is specified using the VEX.L bit. When L = 0, size is 128 bits; when L = 1, size is 256 bits.
Note 2: Operands are scalars. VEX.L bit is ignored.

Table A-20. VEX Opcode Map 2, Low Nibble = [0h:7h]

| VEX.pp | Opcode | x0h | x1h | x2h | x3h | x4h | x5h | x6h | x7h |
|---|---|---|---|---|---|---|---|---|---|
| 01 | 0xh | VPSHUFB¹ Vpbx, Hpbx, Wpbx | VPHADDW¹ Vpix, Hpix, Wpix | VPHADDD¹ Vpjx, Hpjx, Wpjx | VPHADDSW¹ Vpix, Hpix, Wpix | VPMADDUBSW¹ Vpix, Hpkx, Wpkx | VPHSUBW¹ Vpix, Hpix, Wpix | VPHSUBD¹ Vpjx, Hpjx, Wpjx | VPHSUBSW¹ Vpix, Hpix, Wpix |
| 01 | 1xh | | | | VCVTPH2PS¹ Vpsx, Wphx | | | VPERMPS Vps, Hd, Wps | VPTEST¹,⁴ Vx, Wx |
| 01 | 2xh | VPMOVSXBW¹ Vpix, Wpkx | VPMOVSXBD¹ Vpjx, Wpkx | VPMOVSXBQ¹ Vpqx, Wpkx | VPMOVSXWD¹ Vpjx, Wpix | VPMOVSXWQ¹ Vpqx, Wpix | VPMOVSXDQ¹ Vpqx, Wpjx | | |
| 01 | 3xh | VPMOVZXBW¹ Vpix, Wpkx | VPMOVZXBD¹ Vpjx, Wpkx | VPMOVZXBQ¹ Vpqx, Wpkx | VPMOVZXWD¹ Vpjx, Wpix | VPMOVZXWQ¹ Vpqx, Wpix | VPMOVZXDQ¹ Vpqx, Wpjx | VPERMD Vd, Hd, Wd | VPCMPGTQ¹ Vpqx, Hpqx, Wpqx |
| 01 | 4xh | VPMULLD¹ Vpjx, Hpjx, Wpjx | VPHMINPOSUW Vo, Wpi | | | | VPSRLVD¹ Vx, Hx, Wx (W=0); VPSRLVQ¹ Vx, Hx, Wx (W=1) | VPSRAVD¹ Vpdwx, Hpdwx, Wpdwx | VPSLLVD¹ Vx, Hx, Wx (W=0); VPSLLVQ¹ Vx, Hx, Wx (W=1) |
| | 5xh-8xh | | | | | | | | |
| 01 | 9xh | VPGATHERDD¹,⁵ Vx, M*d, Hpdw (W=0); VPGATHERDQ¹,⁵ Vx, M*q, Hpqwx (W=1) | VPGATHERQD¹,⁵ Vx, M*d, Hpdw (W=0); VPGATHERQQ¹,⁵ Vx, M*q, Hpqw (W=1) | VGATHERDPS¹,⁵ Vx, M*ps, Hpsx (W=0); VGATHERDPD¹,⁵ Vx, M*pd, Hpdx (W=1) | VGATHERQPS¹,⁵ Vx, M*ps, Hps (W=0); VGATHERQPD¹,⁵ Vx, M*pd, Hpdx (W=1) | | | VFMADDSUB132PS¹,² Vx, Hx, Wx (W=0); VFMADDSUB132PD¹,² Vx, Hx, Wx (W=1) | VFMSUBADD132PS¹,³ Vx, Hx, Wx (W=0); VFMSUBADD132PD¹,³ Vx, Hx, Wx (W=1) |
| 01 | Axh | | | | | | | VFMADDSUB213PS¹ Vx, Hx, Wx (W=0); VFMADDSUB213PD¹ Vx, Hx, Wx (W=1) | VFMSUBADD213PS¹ Vx, Hx, Wx (W=0); VFMSUBADD213PD¹ Vx, Hx, Wx (W=1) |
| 01 | Bxh | | | | | | | VFMADDSUB231PS¹ Vx, Hx, Wx (W=0); VFMADDSUB231PD¹ Vx, Hx, Wx (W=1) | VFMSUBADD231PS¹ Vx, Hx, Wx (W=0); VFMSUBADD231PD¹ Vx, Hx, Wx (W=1) |
| | Cxh-Exh | | | | | | | | |
| 00 | Fxh | | | ANDN Gy, By, Ey | VEX group #17 | | BZHI Gy, Ey, By | | BEXTR Gy, Ey, By |
| 01 | | | | | | | | | SHLX Gy, Ey, By |
| 10 | | | | | | | PEXT Gy, By, Ey | | SARX Gy, Ey, By |
| 11 | | | | | | | PDEP Gy, By, Ey | MULX Gy, By, Ey | SHRX Gy, Ey, By |

Note 1: Supports both 128-bit and 256-bit vector sizes. Vector size is specified using the VEX.L bit. When L = 0, size is 128 bits; when L = 1, size is 256 bits.

Note 2: For all VFMADDSUBnnnPS instructions, the data type is packed single-precision floating point.
For all VFMADDSUBnnnPD instructions, the data type is packed double-precision floating point.

Note 3: For all VFMSUBADDnnnPS instructions, the data type is packed single-precision floating point.
For all VFMSUBADDnnnPD instructions, the data type is packed double-precision floating point.

Note 4: Operands are treated as bit vectors.

Note 5: Uses VSIB addressing mode.

Table A-21. VEX Opcode Map 2, Low Nibble = [8h:Fh]

| VEX.pp | Opcode | x8h | x9h | xAh | xBh | xCh | xDh | xEh | xFh |
|---|---|---|---|---|---|---|---|---|---|
| 01 | 0xh | VPSIGNB¹ Vpkx, Hpkx, Wpkx | VPSIGNW¹ Vpi, Hpi, Wpi | VPSIGND¹ Vpjx, Hpjx, Wpjx | VPMULHRSW¹ Vpix, Hpix, Wpix | VPERMILPS¹ Vpsx, Hpsx, Wpdwx | VPERMILPD¹ Vpdx, Hpdx, Wpqwx | VTESTPS¹ Vpsx, Wpsx | VTESTPD¹ Vpdx, Wpdx |
| 01 | 1xh | VBROADCASTSS¹ Vps, Wss | VBROADCASTSD Vpd, Wsd (VEX.L=1) | VBROADCASTF128 Vdo, Mo (VEX.L=1) | | VPABSB¹ Vpkx, Wpkx | VPABSW¹ Vpix, Wpix | VPABSD¹ Vpjx, Wpjx | |
| 01 | 2xh | VPMULDQ¹ Vpqx, Hpjx, Wpjx | VPCMPEQQ¹ Vpqx, Hpqx, Wpqx | VMOVNTDQA¹ Vx, Mx | VPACKUSDW¹ Vpix, Hpjx, Wpjx | VMASKMOVPS¹ Vpsx, Hx, Mpsx | VMASKMOVPD¹ Vpdx, Hx, Mpdx | VMASKMOVPS¹ Mpsx, Hx, Vpsx | VMASKMOVPD¹ Mpdx, Hx, Vpdx |
| 01 | 3xh | VPMINSB¹ Vpkx, Hpkx, Wpkx | VPMINSD¹ Vpjx, Hpjx, Wpjx | VPMINUW¹ Vpix, Hpix, Wpix | VPMINUD¹ Vpjx, Hpjx, Wpjx | VPMAXSB¹ Vpkx, Hpkx, Wpkx | VPMAXSD¹ Vpjx, Hpjx, Wpjx | VPMAXUW¹ Vpix, Hpix, Wpix | VPMAXUD¹ Vpjx, Hpjx, Wpjx |
| | 4xh | | | | | | | | |
| 01 | 5xh | VPBROADCASTD¹ Vx, Wd | VPBROADCASTQ¹ Vx, Wq | VBROADCASTI128 Vdo, Mo | | | | | |
| | 6xh | | | | | | | | |
| 01 | 7xh | VPBROADCASTB¹ Vx, Wb | VPBROADCASTW¹ Vx, Ww | | | | | | |
| 01 | 8xh | | | | | VPMASKMOVD¹ Vx, Hx, Mx (W=0); VPMASKMOVQ¹ Vx, Hx, Mx (W=1) | | VPMASKMOVD¹ Mx, Hx, Vx (W=0); VPMASKMOVQ¹ Mx, Hx, Vx (W=1) | |
| 01 | 9xh | VFMADD132PS¹,³ Vx, Hx, Wx (W=0); VFMADD132PD¹,³ Vx, Hx, Wx (W=1) | VFMADD132SS² Vo, Ho, Wd (W=0); VFMADD132SD² Vo, Ho, Wq (W=1) | VFMSUB132PS¹,⁴ Vx, Hx, Wx (W=0); VFMSUB132PD¹,⁴ Vx, Hx, Wx (W=1) | VFMSUB132SS² Vo, Ho, Wd (W=0); VFMSUB132SD² Vo, Ho, Wq (W=1) | VFNMADD132PS¹ Vx, Hx, Wx (W=0); VFNMADD132PD¹ Vx, Hx, Wx (W=1) | VFNMADD132SS² Vo, Ho, Wd (W=0); VFNMADD132SD² Vo, Ho, Wq (W=1) | VFNMSUB132PS¹ Vx, Hx, Wx (W=0); VFNMSUB132PD¹ Vx, Hx, Wx (W=1) | VFNMSUB132SS² Vo, Ho, Wd (W=0); VFNMSUB132SD² Vo, Ho, Wq (W=1) |
| 01 | Axh | VFMADD213PS¹ Vx, Hx, Wx (W=0); VFMADD213PD¹ Vx, Hx, Wx (W=1) | VFMADD213SS² Vo, Ho, Wd (W=0); VFMADD213SD² Vo, Ho, Wq (W=1) | VFMSUB213PS¹ Vx, Hx, Wx (W=0); VFMSUB213PD¹ Vx, Hx, Wx (W=1) | VFMSUB213SS² Vo, Ho, Wd (W=0); VFMSUB213SD² Vo, Ho, Wq (W=1) | VFNMADD213PS¹ Vx, Hx, Wx (W=0); VFNMADD213PD¹ Vx, Hx, Wx (W=1) | VFNMADD213SS² Vo, Ho, Wd (W=0); VFNMADD213SD² Vo, Ho, Wq (W=1) | VFNMSUB213PS¹ Vx, Hx, Wx (W=0); VFNMSUB213PD¹ Vx, Hx, Wx (W=1) | VFNMSUB213SS² Vo, Ho, Wd (W=0); VFNMSUB213SD² Vo, Ho, Wq (W=1) |
| 01 | Bxh | VFMADD231PS¹ Vx, Hx, Wx (W=0); VFMADD231PD¹ Vx, Hx, Wx (W=1) | VFMADD231SS² Vo, Ho, Wd (W=0); VFMADD231SD² Vo, Ho, Wq (W=1) | VFMSUB231PS¹ Vx, Hx, Wx (W=0); VFMSUB231PD¹ Vx, Hx, Wx (W=1) | VFMSUB231SS² Vo, Ho, Wd (W=0); VFMSUB231SD² Vo, Ho, Wq (W=1) | VFNMADD231PS¹ Vx, Hx, Wx (W=0); VFNMADD231PD¹ Vx, Hx, Wx (W=1) | VFNMADD231SS² Vo, Ho, Wd (W=0); VFNMADD231SD² Vo, Ho, Wq (W=1) | VFNMSUB231PS¹ Vx, Hx, Wx (W=0); VFNMSUB231PD¹ Vx, Hx, Wx (W=1) | VFNMSUB231SS² Vo, Ho, Wd (W=0); VFNMSUB231SD² Vo, Ho, Wq (W=1) |
| | Cxh | | | | | | | | |
| 01 | Dxh | | | | VAESIMC Vo, Wo | VAESENC Vo, Ho, Wo | VAESENCLAST Vo, Ho, Wo | VAESDEC Vo, Ho, Wo | VAESDECLAST Vo, Ho, Wo |
| | Exh-Fxh | | | | | | | | |

Note 1: Supports both 128-bit and 256-bit vector sizes. Vector size is specified using the VEX.L bit. When L = 0, size is 128 bits; when L = 1, size is 256 bits.

Note 2: Operands are scalars. VEX.L bit is ignored.

Note 3: For all VFMADDnnnPS instructions, the data type is packed single-precision floating point.
For all VFMADDnnnPD instructions, the data type is packed double-precision floating point.

Note 4: For all VFMSUBnnnPS instructions, the data type is packed single-precision floating point.
For all VFMSUBnnnPD instructions, the data type is packed double-precision floating point.
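
The fused multiply-add entries in Tables A-20 and A-21 follow a regular pattern: the high nibble of the opcode picks the operand ordering (132, 213, or 231), the low nibble picks the operation and whether it is packed or scalar, and VEX.W picks single or double precision. The sketch below illustrates that pattern under the assumption that VEX map 02h and VEX.pp = 01 have already been matched; `fma_mnemonic` and the two lookup tables are hypothetical names for this example.

```python
# High nibble of the opcode -> operand ordering (rows 9xh, Axh, Bxh).
OPERAND_ORDER = {0x9: "132", 0xA: "213", 0xB: "231"}

# Low nibble of the opcode -> (operation, packed "P" or scalar "S").
OPERATION = {
    0x6: ("VFMADDSUB", "P"), 0x7: ("VFMSUBADD", "P"),
    0x8: ("VFMADD", "P"),    0x9: ("VFMADD", "S"),
    0xA: ("VFMSUB", "P"),    0xB: ("VFMSUB", "S"),
    0xC: ("VFNMADD", "P"),   0xD: ("VFNMADD", "S"),
    0xE: ("VFNMSUB", "P"),   0xF: ("VFNMSUB", "S"),
}


def fma_mnemonic(opcode: int, w: int) -> str:
    """Name the FMA instruction for a VEX map 02h opcode and VEX.W value."""
    order = OPERAND_ORDER.get(opcode >> 4)
    operation = OPERATION.get(opcode & 0xF)
    if order is None or operation is None or w not in (0, 1):
        raise ValueError("not an FMA encoding listed in Tables A-20 and A-21")
    base, kind = operation
    precision = "D" if w else "S"  # W=0 single precision, W=1 double precision
    return f"{base}{order}{kind}{precision}"


if __name__ == "__main__":
    print(fma_mnemonic(0x98, 0))  # VFMADD132PS
    print(fma_mnemonic(0xA9, 1))  # VFMADD213SD
    print(fma_mnemonic(0xB6, 1))  # VFMADDSUB231PD
```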
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Table A-22. VEX Opcode Map 3, Low Nibble = [0h:7h] 


VEX.pp 

Nibble 

xOh 

xlh 

x2h 

x3h 

x4h 

x5h 

x6h 

x7h 

00 

Oxh 









01 

VPERMQ 

Vq, Wq, lb 

VPERMPD 

Vpd, Wpd, Ib 

VPBLENDD¹
Vpdwx, Hpdwx,
Wpdwx, Ib


VPERMILPS¹
Vpsx, Wpsx, Ib

VPERMILPD¹
Vpdx, Wpdx, Ib

VPERM2F128 

Vdo, Ho, Wo, Ib 
(VEX.L=1) 


00 

1xh









01 





VPEXTRB 

Mb, Vpb, Ib 

VPEXTRB 

Ry, Vpb, Ib 

VPEXTRW 

Mw, Vpw, Ib 

VPEXTRW 

Ry, Vpw, Ib 

VPEXTRD 

Ed, Vpdw, Ib 
VPEXTRQ 

Eq, Vpqw, Ib 

VEXTRACTPS 

Mss, Vps, Ib 

VEXTRACTPS 

Rss, Vps, Ib 

00 

2xh 









01 

VPINSRB 

Vpb, Hpb, Wb, Ib

VINSERTPS

Vps, Hps, Ups/Md, Ib

VPINSRD

Vpdw, Hpdw, Ed, Ib
(W=0)
VPINSRQ

Vpdw, Hpqw, Eq, Ib
(W=1)






...

3xh

...








00 

4xh 









01 

VDPPS¹

Vpsx, Hpsx, Wpsx, Ib

VDPPD

Vpd, Hpd, Wpd, Ib

VMPSADBW¹
Vpix, Hpkx, Wpkx, Ib


VPCLMULQDQ 

Vo, Hpq, Wpq, Ib 


VPERM2I128 

Vo, Ho, Wo, ib 



5xh 









00 

6xh 









01 

VPCMPESTRM 

Vo, Wo, Ib

VPCMPESTRI 

Vo, Wo, Ib 

VPCMPISTRM 

Vo, Wo, Ib 

VPCMPISTRI 

Vo, Wo, Ib 






7xh-Exh 









10 

Fxh 









11 

RORX 

Gy, Ey, ib 









Note 1: Supports both 128 bit and 256 bit vector sizes. Vector size is specified using the VEX.L bit. When L=0, size is 128 bits; when L=1, size is 256 bits.

Table A-23. VEX Opcode Map 3, Low Nibble = [8h:Fh]

VEX.pp   Opcode   x8h   x9h   xAh   xBh   xCh   xDh   xEh   xFh

01

0xh

VROUNDPS¹
Vpsx, Wpsx, Ib

VROUNDPD¹
Vpdx, Wpdx, Ib

VROUNDSS
Vss, Hss, Wss, Ib

VROUNDSD
Vsd, Hsd, Wsd, Ib

VBLENDPS¹
Vpsx, Hpsx, Wpsx, Ib

VBLENDPD¹
Vpdx, Hpdx, Wpdx, Ib

VPBLENDW¹
Vpwx, Hpwx, Wpwx, Ib

VPALIGNR¹
Vpbx, Hpbx, Wpbx, Ib

01 

1xh

VINSERTF128
Vdo, Hdo, Wo, Ib

VEXTRACTF128
Wo, Vdo, Ib



VCVTPS2PH¹
Wph, Vps, Ib




2xh 

. . . 








01 

3xh 

VINSERTI128 

Vdo, Hdo, Wo, Ib

VEXTRACTI128

Wo, Vdo, Ib







01 

4xh 

VPERMILzz2PS¹,²
Vpsx, Hpsx, Wpsx, Lpsx, Ib (W=0)
Vpsx, Hpsx, Lpsx, Wpsx, Ib (W=1)

VPERMILzz2PD¹,²
Vpdx, Hpdx, Wpdx, Lpdx, Ib (W=0)
Vpdx, Hpdx, Lpdx, Wpdx, Ib (W=1)

VBLENDVPS¹
Vpsx, Hpsx, Wpsx, Lpdx

VBLENDVPD¹
Vpdx, Hpdx, Wpdx, Lpdx

VPBLENDVB¹
Vpbx, Hpbx, Wpbx, Lx




01 

5xh 





VFMADDSUBPS¹
Vpsx, Lpsx, Wpsx, Hpsx (W=0)
Vpsx, Lpsx, Hpsx, Wpsx (W=1)

VFMADDSUBPD¹
Vpdx, Lpdx, Wpdx, Hpdx (W=0)
Vpdx, Lpdx, Hpdx, Wpdx (W=1)

VFMSUBADDPS¹
Vpsx, Lpsx, Wpsx, Hpsx (W=0)
Vpsx, Lpsx, Hpsx, Wpsx (W=1)

VFMSUBADDPD¹
Vpdx, Lpdx, Wpdx, Hpdx (W=0)
Vpdx, Lpdx, Hpdx, Wpdx (W=1)

01

6xh

VFMADDPS¹
Vpsx, Lpsx, Wpsx, Hpsx (W=0)
Vpsx, Lpsx, Hpsx, Wpsx (W=1)

VFMADDPD¹
Vpdx, Lpdx, Wpdx, Hpdx (W=0)
Vpdx, Lpdx, Hpdx, Wpdx (W=1)

VFMADDSS
Vss, Lss, Wss, Hss (W=0)
Vss, Lss, Hss, Wss (W=1)

VFMADDSD
Vsd, Lsd, Wsd, Hsd (W=0)
Vsd, Lsd, Hsd, Wsd (W=1)

VFMSUBPS¹
Vpsx, Lpsx, Wpsx, Hpsx (W=0)
Vpsx, Lpsx, Hpsx, Wpsx (W=1)

VFMSUBPD¹
Vpdx, Lpdx, Wpdx, Hpdx (W=0)
Vpdx, Lpdx, Hpdx, Wpdx (W=1)

VFMSUBSS
Vss, Lss, Wss, Hss (W=0)
Vss, Lss, Hss, Wss (W=1)

VFMSUBSD
Vsd, Lsd, Wsd, Hsd (W=0)
Vsd, Lsd, Hsd, Wsd (W=1)

01

7xh

VFNMADDPS¹
Vpsx, Lpsx, Wpsx, Hpsx (W=0)
Vpsx, Lpsx, Hpsx, Wpsx (W=1)

VFNMADDPD¹
Vpdx, Lpdx, Wpdx, Hpdx (W=0)
Vpdx, Lpdx, Hpdx, Wpdx (W=1)

VFNMADDSS
Vss, Lss, Wss, Hss (W=0)
Vss, Lss, Hss, Wss (W=1)

VFNMADDSD
Vsd, Lsd, Wsd, Hsd (W=0)
Vsd, Lsd, Hsd, Wsd (W=1)

VFNMSUBPS¹
Vpsx, Lpsx, Wpsx, Hpsx (W=0)
Vpsx, Lpsx, Hpsx, Wpsx (W=1)

VFNMSUBPD¹
Vpdx, Lpdx, Wpdx, Hpdx (W=0)
Vpdx, Lpdx, Hpdx, Wpdx (W=1)

VFNMSUBSS
Vss, Lss, Wss, Hss (W=0)
Vss, Lss, Hss, Wss (W=1)

VFNMSUBSD
Vsd, Lsd, Wsd, Hsd (W=0)
Vsd, Lsd, Hsd, Wsd (W=1)

. .. 

8xh-Cxh 

. . . 








01 

Dxh 








VAESKEYGEN- 

ASSIST 

Vo, Wo, Ib


Exh-Fxh 










Note 1: Supports both 128 bit and 256 bit vector sizes. Vector size is specified using the VEX.L bit. When L=0, size is 128 bits; when L=1, size is 256 bits. 

Note 2: The zero match codes are TD, TD (alias), MO, and MZ. They are encoded as the zzzz field of the Ib, using 0...3h.

Table A-24. VEX Opcode Groups

Group    VEX Map,   VEX.pp   ModRM Byte   Instruction and Operands
Number   Opcode
12       1, 71h     01       xx010xxx     VPSRLW¹ Hpwx, Upwx, Ib
                             xx100xxx     VPSRAW¹ Hpwx, Upwx, Ib
                             xx110xxx     VPSLLW¹ Hpwx, Upwx, Ib
13       1, 72h     01       xx010xxx     VPSRLD¹ Hpdwx, Updwx, Ib
                             xx100xxx     VPSRAD¹ Hpdwx, Updwx, Ib
                             xx110xxx     VPSLLD¹ Hpdwx, Updwx, Ib
14       1, 73h     01       xx010xxx     VPSRLQ¹ Hpqwx, Upqwx, Ib
                             xx011xxx     VPSRLDQ¹ Hpbx, Upbx, Ib
                             xx110xxx     VPSLLQ¹ Hpqwx, Upqwx, Ib
                             xx111xxx     VPSLLDQ¹ Hpbx, Upbx, Ib
15       1, AEh     00       xx010xxx     VLDMXCSR Md
                             xx011xxx     VSTMXCSR Md
17       2, F3h     00       xx001xxx     BLSR By, Ey
                             xx010xxx     BLSMSK By, Ey
                             xx011xxx     BLSI By, Ey

Note 1: Supports both 128 bit and 256 bit vector sizes. Vector size is specified using the VEX.L bit. When L = 0, size is 128 bits; when L = 1, size is 256 bits.


XOP Opcode Maps. Tables A-25 through A-30 below present the XOP opcode maps, and Table A-31 on
page 495 presents the XOP opcode groups.


Table A-25. XOP Opcode Map 8h, Low Nibble = [0h:7h]

XOP.pp   Opcode   x0h   x1h   x2h   x3h   x4h   x5h   x6h   x7h


0xh-7xh 
















VPMACSSWW 

VPMACSSWD 

VPMACSSDQL 

00 

8xh 






Vo, Ho, Wo, Lo 

Vo,Ho,Wo,Lo 

Vo,Ho,Wo,Lo 








VPMACSWW

VPMACSWD 

VPMACSDQL 

00 

9xh 






Vo, Ho, Wo, Lo 

Vo,Ho,Wo,Lo 

Vo,Ho,Wo,Lo 





VPCMOV 

VPPERM 



VPMADCSSWD 


00 

Axh 



Vx,Hx,Wx,Lx (W=0) 

Vo,Ho,Wo,Lo (W=0) 



Vo, Ho, Wo, Lo 






Vx,Hx,Lx,Wx (W=1)

Vo,Ho,Lo,Wo (W=1)













VPMADCSWD 


00 

Bxh 







Vo,Ho,Wo,Lo 




VPROTB 

VPROTW 

VPROTD 

VPROTQ 





00 

Cxh 

Vo, Wo, Ib

Vo, Wo, Ib

Vo, Wo, Ib

Vo, Wo, Ib






Dxh-Fxh

Table A-26. XOP Opcode Map 8h, Low Nibble = [8h:Fh]

XOP.pp   Opcode   x8h   x9h   xAh   xBh   xCh   xDh   xEh   xFh

0xh-7xh

















VPMACSSDD 

VPMACSSDQH 

00 

8xh 







Vo,Ho,Wo,Lo 

Vo, Ho, Wo, Lo 









VPMACSDD 

VPMACSDQH 

00 

9xh 







Vo,Ho,Wo,Lo 

Vo,Ho,Wo,Lo 


Axh-Bxh 















VPCOMccB¹

VPCOMccW¹

VPCOMccD¹

VPCOMccQ¹

00

Cxh





Vo,Ho,Wo,Ib

Vo,Ho,Wo,Ib

Vo,Ho,Wo,Ib

Vo,Ho,Wo,Ib

00 

Dxh 















VPCOMccUB¹

VPCOMccUW¹

VPCOMccUD¹

VPCOMccUQ¹

00

Exh





Vo,Ho,Wo,Ib

Vo,Ho,Wo,Ib

Vo,Ho,Wo,Ib

Vo,Ho,Wo,Ib

00 

Fxh 










Note 1: The condition codes are LT, LE, GT, GE, EQ, NEQ, FALSE, and TRUE. They are encoded via Ib, using 00...07h.
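
As a rough illustration of Note 1, the following Python sketch maps an immediate byte to the condition
names in the order in which the note lists them. The names `VPCOM_CONDITIONS` and `vpcom_condition` are
hypothetical, and masking the immediate to its low three bits is an assumption made for this sketch; the
individual VPCOMcc instruction descriptions remain the reference for how the immediate is interpreted.

```python
# Condition names in the order given by Note 1, indexed by immediate value 00h..07h.
VPCOM_CONDITIONS = ("LT", "LE", "GT", "GE", "EQ", "NEQ", "FALSE", "TRUE")


def vpcom_condition(imm8: int) -> str:
    """Return the condition name selected by an immediate byte (illustrative only)."""
    # Assumption: only the low three bits of the immediate select the condition.
    return VPCOM_CONDITIONS[imm8 & 0x07]
```

For example, under these assumptions an immediate of 05h selects the NEQ form of the comparison.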




Table A-27. XOP Opcode Map 9h, Low Nibble = [0h:7h]

XOP.pp   Opcode   x0h   x1h   x2h   x3h   x4h   x5h   x6h   x7h

00

0xh


XOP group #1 

XOP group #2 






00 

1xh



XOP group #3 







2xh-7xh 











VFRCZPS 

VFRCZPD 

VFRCZSS 

VFRCZSD 





00 

8xh 

Vx,Wx 

Vx,Wx 

Vq,Wss 

Vq,Wsd 







VPROTB 

VPROTW 

VPROTD 

VPROTQ 

VPSHLB 

VPSHLW 

VPSHLD 

VPSHLQ 

00 

9xh 

Vo,Wo,Ho (W=0) 

Vo,Wo,Ho (W=0) 

Vo,Wo, Ho (W=0) 

Vo,Wo,Ho (W=0) 

Vo,Wo, Ho (W=0) 

Vo,Wo, Ho (W=0) 

Vo,Wo,Ho (W=0) 

Vo,Wo, Ho (W=0) 



Vo,Ho,Wo (W=1)

Vo,Ho,Wo (W=1)

Vo,Ho,Wo (W=1)

Vo,Ho,Wo (W=1)

Vo,Ho,Wo (W=1)

Vo,Ho,Wo (W=1)

Vo,Ho,Wo (W=1)

Vo,Ho,Wo (W=1)


Axh-Bxh 












VPHADDBW 

VPHADDBD 

VPHADDBQ 



VPHADDWD 

VPHADDWQ 

00 

Cxh 


Vo,Wo 

Vo, Wo 

Vo,Wo 



Vo,Wo 

Vo, Wo 




VPHADDUBW

VPHADDUBD 

VPHADDUBQ 



VPHADDUWD 

VPHADDUWQ 

00 

Dxh 


Vo,Wo 

Vo, Wo 

Vo,Wo 



Vo,Wo 

Vo,Wo 




VPHSUBBW 

VPHSUBWD 

VPHSUBDQ 





00 

Exh 


Vo,Wo 

Vo, Wo 

Vo,Wo 






Fxh

Table A-28. XOP Opcode Map 9h, Low Nibble = [8h:Fh]

XOP.pp   Opcode   x8h   x9h   xAh   xBh   xCh   xDh   xEh   xFh


0xh-8xh 









00 

9xh 

VPSHAB 

Vo, Wo, Ho (W=0)
Vo, Ho, Wo (W=1)

VPSHAW

Vo, Wo, Ho (W=0)
Vo, Ho, Wo (W=1)

VPSHAD

Vo, Wo, Ho (W=0)
Vo, Ho, Wo (W=1)

VPSHAQ

Vo, Wo, Ho (W=0)
Vo, Ho, Wo (W=1)






Axh-Bxh 









00 

Cxh 




VPHADDDQ 

Vo,Wo 





00 

Dxh 




VPHADDUDQ 

Vo,Wo 






Exh-Fxh 










Table A-29. XOP Opcode Map Ah, Low Nibble = [0h:7h]

XOP.pp   Opcode   x0h   x1h   x2h   x3h   x4h   x5h   x6h   x7h

0xh




00

1xh

BEXTR

Gy,Ey,Id


XOP group #4 







2xh-Fxh 










Table A-30. XOP Opcode Map Ah, Low Nibble = [8h:Fh] 


XOP.pp 

Opcode 

x8h 

x9h 

xAh 

xBh 

xCh 

xDh 

xEh 

xFh 

n/a 

0xh-Fxh









Opcodes Reserved 


Table A-31. XOP Opcode Groups

Group              ModRM.reg   Instruction and Operands
XOP 9, 01h (#1)    /1          BLCFILL By,Ey
                   /2          BLSFILL By,Ey
                   /3          BLCS By,Ey
                   /4          TZMSK By,Ey
                   /5          BLCIC By,Ey
                   /6          BLSIC By,Ey
                   /7          T1MSKC By,Ey
XOP 9, 02h (#2)    /1          BLCMSK By,Ey
                   /6          BLCI By,Ey
XOP 9, 12h (#3)    /0          LLWPCB Ry
                   /1          SLWPCB Ry
XOP A, 12h (#4)    /0          LWPINS By,Ed,Id
                   /1          LWPVAL By,Ed,Id

A.2 Operand Encodings

An operand is data that affects or is affected by the execution of an instruction. Operands may be
located in registers, memory, or I/O ports. For some instructions, the location of one or more operands
is implicitly specified based on the opcode alone. However, for most instructions, operands are
specified using bytes that immediately follow the opcode byte. These bytes are designated the
mode-register-memory (ModRM) byte, the scale-index-base (SIB) byte, the displacement byte(s), and the
immediate byte(s). The SIB, displacement, and immediate bytes are optional; whether they are present
depends on the instruction and, for instructions that reference memory, on the memory addressing
mode.

The following sections describe the encoding of the ModRM and SIB bytes in various processor 
modes. 

A.2.1 ModRM Operand References 

Figure A-2 below shows the format of the ModRM byte. There are three fields: mod, reg, and r/m.
The reg field is normally used to specify a register-based operand. The mod and r/m fields together
provide a 5-bit field, augmented in 64-bit mode by the R and B bits of a REX, VEX, or XOP prefix,
normally used to specify the location of a second memory- or register-based operand and, for a
memory-based operand, the addressing mode.

As described in “Encoding Extensions Using the ModRM Byte” on page 465, certain instructions use 
either the reg field, the r/m field, or the entire ModRM byte to extend the opcode byte in the encoding 
of the instruction operation. 
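
The field split described above can be sketched in a few lines of Python. This is an illustration only,
not part of the architecture definition: the names `ModRM` and `decode_modrm` are hypothetical, the bit
positions used (mod in bits 7:6, reg in bits 5:3, r/m in bits 2:0) are the conventional ModRM layout, and
`r_bit` and `b_bit` are assumed to be the logical values of the prefix R and B bits (VEX and XOP prefixes
store these bits in inverted form, which the sketch does not model).

```python
from typing import NamedTuple


class ModRM(NamedTuple):
    mod: int  # bits 7:6
    reg: int  # bits 5:3, extended to 4 bits by the R bit
    rm: int   # bits 2:0, extended to 4 bits by the B bit


def decode_modrm(byte: int, r_bit: int = 0, b_bit: int = 0) -> ModRM:
    """Split a ModRM byte into its fields, applying the optional R and B extension bits."""
    mod = (byte >> 6) & 0b11
    reg = ((byte >> 3) & 0b111) | (r_bit << 3)
    rm = (byte & 0b111) | (b_bit << 3)
    return ModRM(mod, reg, rm)
```

For instance, `decode_modrm(0x5A)` yields mod = 01b, reg = 3, and r/m = 2. The sketch always applies the
B bit to r/m; when a SIB byte is present, that bit extends the SIB base field instead.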


[Figure: ModRM byte layout showing the mod, reg, and r/m fields. REX.R, VEX.R, or XOP.R extends the
reg field to 4 bits; REX.B, VEX.B, or XOP.B extends the r/m field to 4 bits.]

Figure A-2. ModRM-Byte Format

The two sections below describe the ModRM operand encodings, first for 16-bit references and then 
for 32-bit and 64-bit references. 

16-Bit Register and Memory References. Table A-32 shows the notation and encoding
conventions for register references using the ModRM reg field. This table is comparable to Table A-34
on page 499 but applies only when the address size is 16 bits. Table A-33 on page 497 shows the
notation and encoding conventions for 16-bit memory references using the ModRM byte. This table is
comparable to Table A-35 on page 500.


Table A-32. ModRM reg Field Encoding, 16-Bit Addressing

Mnemonic    ModRM reg Field
Notation    /0      /1      /2      /3      /4      /5      /6        /7
reg8        AL      CL      DL      BL      AH      CH      DH        BH
reg16       AX      CX      DX      BX      SP      BP      SI        DI
reg32       EAX     ECX     EDX     EBX     ESP     EBP     ESI       EDI
mmx         MMX0    MMX1    MMX2    MMX3    MMX4    MMX5    MMX6      MMX7
xmm         XMM0    XMM1    XMM2    XMM3    XMM4    XMM5    XMM6      XMM7
ymm         YMM0    YMM1    YMM2    YMM3    YMM4    YMM5    YMM6      YMM7
sReg        ES      CS      SS      DS      FS      GS      invalid   invalid
cReg        CR0     CR1     CR2     CR3     CR4     CR5     CR6       CR7
dReg        DR0     DR1     DR2     DR3     DR4     DR5     DR6       DR7


Table A-33. ModRM Byte Encoding, 16-Bit Addressing

                             ModRM      ModRM reg Field¹                  ModRM
                             mod Field  /0  /1  /2  /3  /4  /5  /6  /7    r/m Field
Effective Address            (binary)   Complete ModRM Byte (hex)         (binary)
[BX] + [SI]                  00         00  08  10  18  20  28  30  38    000
[BX] + [DI]                  00         01  09  11  19  21  29  31  39    001
[BP] + [SI]                  00         02  0A  12  1A  22  2A  32  3A    010
[BP] + [DI]                  00         03  0B  13  1B  23  2B  33  3B    011
[SI]                         00         04  0C  14  1C  24  2C  34  3C    100
[DI]                         00         05  0D  15  1D  25  2D  35  3D    101
disp16                       00         06  0E  16  1E  26  2E  36  3E    110
[BX]                         00         07  0F  17  1F  27  2F  37  3F    111
[BX] + [SI] + disp8          01         40  48  50  58  60  68  70  78    000
[BX] + [DI] + disp8          01         41  49  51  59  61  69  71  79    001
[BP] + [SI] + disp8          01         42  4A  52  5A  62  6A  72  7A    010
[BP] + [DI] + disp8          01         43  4B  53  5B  63  6B  73  7B    011
[SI] + disp8                 01         44  4C  54  5C  64  6C  74  7C    100
[DI] + disp8                 01         45  4D  55  5D  65  6D  75  7D    101
[BP] + disp8                 01         46  4E  56  5E  66  6E  76  7E    110
[BX] + disp8                 01         47  4F  57  5F  67  6F  77  7F    111
[BX] + [SI] + disp16         10         80  88  90  98  A0  A8  B0  B8    000
[BX] + [DI] + disp16         10         81  89  91  99  A1  A9  B1  B9    001
[BP] + [SI] + disp16         10         82  8A  92  9A  A2  AA  B2  BA    010
[BP] + [DI] + disp16         10         83  8B  93  9B  A3  AB  B3  BB    011
[SI] + disp16                10         84  8C  94  9C  A4  AC  B4  BC    100
[DI] + disp16                10         85  8D  95  9D  A5  AD  B5  BD    101
[BP] + disp16                10         86  8E  96  9E  A6  AE  B6  BE    110
[BX] + disp16                10         87  8F  97  9F  A7  AF  B7  BF    111
AL/AX/EAX/MMX0/XMM0/YMM0     11         C0  C8  D0  D8  E0  E8  F0  F8    000
CL/CX/ECX/MMX1/XMM1/YMM1     11         C1  C9  D1  D9  E1  E9  F1  F9    001
DL/DX/EDX/MMX2/XMM2/YMM2     11         C2  CA  D2  DA  E2  EA  F2  FA    010
BL/BX/EBX/MMX3/XMM3/YMM3     11         C3  CB  D3  DB  E3  EB  F3  FB    011
AH/SP/ESP/MMX4/XMM4/YMM4     11         C4  CC  D4  DC  E4  EC  F4  FC    100
CH/BP/EBP/MMX5/XMM5/YMM5     11         C5  CD  D5  DD  E5  ED  F5  FD    101
DH/SI/ESI/MMX6/XMM6/YMM6     11         C6  CE  D6  DE  E6  EE  F6  FE    110
BH/DI/EDI/MMX7/XMM7/YMM7     11         C7  CF  D7  DF  E7  EF  F7  FF    111

Notes:

1. See Table A-32 for complete specification of ModRM “reg” field.
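
The 16-bit addressing forms in Table A-33 follow a regular pattern that can be sketched in Python. The
fragment below is illustrative only; the names `RM16_FORMS` and `describe_modrm16` are hypothetical, and
the register operand selected when mod = 11 is reported simply by its r/m number rather than by name.

```python
# Base/index combinations for r/m = 000..111, as listed in Table A-33.
RM16_FORMS = ("[BX] + [SI]", "[BX] + [DI]", "[BP] + [SI]", "[BP] + [DI]",
              "[SI]", "[DI]", "[BP]", "[BX]")


def describe_modrm16(byte: int) -> str:
    """Describe the operand selected by the mod and r/m fields under 16-bit addressing."""
    mod = (byte >> 6) & 0b11
    rm = byte & 0b111
    if mod == 0b11:
        return f"register operand {rm}"
    if mod == 0b00:
        # With mod = 00, r/m = 110 selects a bare 16-bit displacement, not [BP].
        return "disp16" if rm == 0b110 else RM16_FORMS[rm]
    disp = "disp8" if mod == 0b01 else "disp16"
    return f"{RM16_FORMS[rm]} + {disp}"
```

For example, `describe_modrm16(0x46)` gives `[BP] + disp8`, matching the /0 column of the corresponding
table row.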


Register and Memory References for 32-Bit and 64-Bit Addressing. Table A-34 on
page 499 shows the encoding for register references using the ModRM reg field. The first ten rows of
Table A-34 show references when the REX.R bit is cleared to 0, and the last ten rows show references
when the REX.R bit is set to 1. In this table, entries under the Mnemonic Notation heading correspond
to register notation described in “Mnemonic Syntax” on page 52, and the /r notation under the ModRM
reg Field heading corresponds to that described in “Opcode Syntax” on page 55.


Table A-34. ModRM reg Field Encoding, 32-Bit and 64-Bit Addressing

Mnemonic    REX.R   ModRM reg Field
Notation    Bit     /0      /1      /2      /3      /4       /5       /6        /7
reg8        0       AL      CL      DL      BL      AH/SPL   CH/BPL   DH/SIL    BH/DIL
reg16       0       AX      CX      DX      BX      SP       BP       SI        DI
reg32       0       EAX     ECX     EDX     EBX     ESP      EBP      ESI       EDI
reg64       0       RAX     RCX     RDX     RBX     RSP      RBP      RSI       RDI
mmx         0       MMX0    MMX1    MMX2    MMX3    MMX4     MMX5     MMX6      MMX7
xmm         0       XMM0    XMM1    XMM2    XMM3    XMM4     XMM5     XMM6      XMM7
ymm         0       YMM0    YMM1    YMM2    YMM3    YMM4     YMM5     YMM6      YMM7
sReg        0       ES      CS      SS      DS      FS       GS       invalid   invalid
cReg        0       CR0     CR1     CR2     CR3     CR4      CR5      CR6       CR7
dReg        0       DR0     DR1     DR2     DR3     DR4      DR5      DR6       DR7
reg8        1       R8B     R9B     R10B    R11B    R12B     R13B     R14B      R15B
reg16       1       R8W     R9W     R10W    R11W    R12W     R13W     R14W      R15W
reg32       1       R8D     R9D     R10D    R11D    R12D     R13D     R14D      R15D
reg64       1       R8      R9      R10     R11     R12      R13      R14       R15
mmx         1       MMX0    MMX1    MMX2    MMX3    MMX4     MMX5     MMX6      MMX7
xmm         1       XMM8    XMM9    XMM10   XMM11   XMM12    XMM13    XMM14     XMM15
ymm         1       YMM8    YMM9    YMM10   YMM11   YMM12    YMM13    YMM14     YMM15
sReg        1       ES      CS      SS      DS      FS       GS       invalid   invalid
cReg        1       CR8     CR9     CR10    CR11    CR12     CR13     CR14      CR15
dReg        1       DR8     DR9     DR10    DR11    DR12     DR13     DR14      DR15


Table A-35 on page 500 shows the encoding for 32-bit and 64-bit memory references using the 
ModRM byte. This table describes 32-bit and 64-bit addressing, with the REX.B bit set or cleared. The 
Effective Address is shown in the two left-most columns, followed by the binary encoding of the 
ModRM-byte mod field, followed by the eight possible hex values of the complete ModRM byte (one 
value for each binary encoding of the ModRM-byte reg field), followed by the binary encoding of the 
ModRM r/m field. 
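
The eight hex values in each row of Table A-35 are the complete ModRM bytes obtained by holding mod and
r/m fixed and stepping the reg field from 0 through 7. A minimal Python sketch of that relationship
follows; the function name `modrm_row` is hypothetical, and the field positions (mod in bits 7:6, reg in
bits 5:3, r/m in bits 2:0) are the conventional ModRM layout.

```python
def modrm_row(mod: int, rm: int) -> list[str]:
    """Complete ModRM bytes (hex) for reg = /0 through /7 at a fixed mod and r/m."""
    return [f"{(mod << 6) | (reg << 3) | rm:02X}" for reg in range(8)]
```

Calling `modrm_row(0b01, 0b010)` reproduces the row for [rDX] + disp8.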

The /0 through /7 notation for the ModRM reg field (bits [5:3]) means that the three-bit field contains a
value from zero (binary 000) to 7 (binary 111).

Table A-35. ModRM Byte Encoding, 32-Bit and 64-Bit Addressing

Effective Address                                        ModRM      ModRM reg Field¹                  ModRM
                                                         mod Field  /0  /1  /2  /3  /4  /5  /6  /7    r/m Field
REX.B = 0                    REX.B = 1                   (binary)   Complete ModRM Byte (hex)         (binary)
[rAX]                        [r8]                        00         00  08  10  18  20  28  30  38    000
[rCX]                        [r9]                        00         01  09  11  19  21  29  31  39    001
[rDX]                        [r10]                       00         02  0A  12  1A  22  2A  32  3A    010
[rBX]                        [r11]                       00         03  0B  13  1B  23  2B  33  3B    011
SIB²                         SIB²                        00         04  0C  14  1C  24  2C  34  3C    100
[rIP] + disp32 or disp32³    [rIP] + disp32 or disp32³   00         05  0D  15  1D  25  2D  35  3D    101
[rSI]                        [r14]                       00         06  0E  16  1E  26  2E  36  3E    110
[rDI]                        [r15]                       00         07  0F  17  1F  27  2F  37  3F    111
[rAX] + disp8                [r8] + disp8                01         40  48  50  58  60  68  70  78    000
[rCX] + disp8                [r9] + disp8                01         41  49  51  59  61  69  71  79    001
[rDX] + disp8                [r10] + disp8               01         42  4A  52  5A  62  6A  72  7A    010
[rBX] + disp8                [r11] + disp8               01         43  4B  53  5B  63  6B  73  7B    011
[SIB] + disp8                [SIB] + disp8               01         44  4C  54  5C  64  6C  74  7C    100
[rBP] + disp8                [r13] + disp8               01         45  4D  55  5D  65  6D  75  7D    101
[rSI] + disp8                [r14] + disp8               01         46  4E  56  5E  66  6E  76  7E    110
[rDI] + disp8                [r15] + disp8               01         47  4F  57  5F  67  6F  77  7F    111
[rAX] + disp32               [r8] + disp32               10         80  88  90  98  A0  A8  B0  B8    000
[rCX] + disp32               [r9] + disp32               10         81  89  91  99  A1  A9  B1  B9    001
[rDX] + disp32               [r10] + disp32              10         82  8A  92  9A  A2  AA  B2  BA    010
[rBX] + disp32               [r11] + disp32              10         83  8B  93  9B  A3  AB  B3  BB    011
SIB + disp32                 SIB + disp32                10         84  8C  94  9C  A4  AC  B4  BC    100
[rBP] + disp32               [r13] + disp32              10         85  8D  95  9D  A5  AD  B5  BD    101
[rSI] + disp32               [r14] + disp32              10         86  8E  96  9E  A6  AE  B6  BE    110
[rDI] + disp32               [r15] + disp32              10         87  8F  97  9F  A7  AF  B7  BF    111
AL/rAX/MMX0/XMM0/YMM0        r8/MMX0/XMM8/YMM8           11         C0  C8  D0  D8  E0  E8  F0  F8    000
CL/rCX/MMX1/XMM1/YMM1        r9/MMX1/XMM9/YMM9           11         C1  C9  D1  D9  E1  E9  F1  F9    001
DL/rDX/MMX2/XMM2/YMM2        r10/MMX2/XMM10/YMM10        11         C2  CA  D2  DA  E2  EA  F2  FA    010
BL/rBX/MMX3/XMM3/YMM3        r11/MMX3/XMM11/YMM11        11         C3  CB  D3  DB  E3  EB  F3  FB    011
AH/SPL/rSP/MMX4/XMM4/YMM4    r12/MMX4/XMM12/YMM12        11         C4  CC  D4  DC  E4  EC  F4  FC    100
CH/BPL/rBP/MMX5/XMM5/YMM5    r13/MMX5/XMM13/YMM13        11         C5  CD  D5  DD  E5  ED  F5  FD    101
DH/SIL/rSI/MMX6/XMM6/YMM6    r14/MMX6/XMM14/YMM14        11         C6  CE  D6  DE  E6  EE  F6  FE    110
BH/DIL/rDI/MMX7/XMM7/YMM7    r15/MMX7/XMM15/YMM15        11         C7  CF  D7  DF  E7  EF  F7  FF    111

Notes:

1. See Table A-34 for complete specification of ModRM “reg” field.

2. If SIB.base = 5, the SIB byte is followed by four-byte disp32 field and addressing mode is absolute.

3. In 64-bit mode, the effective address is [rIP]+disp32. In all other modes, the effective address is disp32. If the
address-size prefix is used in 64-bit mode to override 64-bit addressing, the [RIP]+disp32 effective address is
truncated after computation to 32 bits.
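
Note 3 can be illustrated with a small Python sketch. It is a model under stated assumptions rather than
a definition: the function name `rip_relative_address` is hypothetical, `next_rip` is assumed to hold the
rIP value used for the calculation (the address of the following instruction), and `disp32` is assumed to
be passed as an already sign-extended signed integer.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def rip_relative_address(next_rip: int, disp32: int, addr32_override: bool = False) -> int:
    """Effective address for mod = 00, r/m = 101 in 64-bit mode (illustrative model)."""
    ea = (next_rip + disp32) & MASK64
    if addr32_override:
        # With the address-size prefix, the sum is truncated to 32 bits after computation.
        ea &= MASK32
    return ea
```

For example, `rip_relative_address(0x401000, -0x10)` evaluates to 0x400FF0.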


A.2.2 SIB Operand References 

Figure A-3 on page 502 shows the format of a scale-index-base (SIB) byte. Some instructions have an
SIB byte following their ModRM byte to define memory addressing for the complex-addressing
modes described in “Effective Addresses” in Volume 1. The SIB byte has three fields (scale, index,
and base) that define the scale factor, index-register number, and base-register number for 32-bit and
64-bit complex addressing modes. In 64-bit mode, the REX.B and REX.X bits extend the encoding of
the SIB byte’s base and index fields.

[Figure: SIB byte layout, bits 7 through 0, showing the scale, index, and base fields. The REX.X bit of
the REX prefix can extend the index field to 4 bits; the REX.B bit of the REX prefix can extend the base
field to 4 bits.]

Figure A-3. SIB Byte Format

Table A-36 shows the encodings for the SIB byte’s base field, which specifies the base register for 
addressing. Table A-37 on page 503 shows the encodings for the effective address referenced by a 
complete SIB byte, including its scale and index fields. The /0 through /7 notation for the SIB base 
field means that the three-bit field contains a value between zero (binary 000) and 7 (binary 111). 
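
The field layout in Figure A-3 can be sketched in Python as follows. The fragment is illustrative only:
the function name `decode_sib` is hypothetical, and the bit positions used (scale in bits 7:6, index in
bits 5:3, base in bits 2:0) are the conventional SIB layout. The sketch returns raw field values and does
not apply the special cases listed in Tables A-36 and A-37, such as an index field of 100 with REX.X = 0
selecting no index register.

```python
def decode_sib(byte: int, rex_x: int = 0, rex_b: int = 0) -> tuple[int, int, int]:
    """Return (scale factor, index register number, base register number) for a SIB byte."""
    scale = 1 << ((byte >> 6) & 0b11)             # scale field 00..11 -> factor 1, 2, 4, 8
    index = ((byte >> 3) & 0b111) | (rex_x << 3)  # REX.X extends index to 4 bits
    base = (byte & 0b111) | (rex_b << 3)          # REX.B extends base to 4 bits
    return scale, index, base
```

For example, `decode_sib(0x98)` returns a scale factor of 4, index register 3 (rBX), and base register 0
(rAX).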


Table A-36. Addressing Modes: SIB base Field Encoding

REX.B   ModRM       SIB base Field
Bit     mod Field   /0      /1      /2      /3      /4      /5               /6      /7
0       00          [rAX]   [rCX]   [rDX]   [rBX]   [rSP]   disp32           [rSI]   [rDI]
0       01          [rAX]   [rCX]   [rDX]   [rBX]   [rSP]   [rBP] + disp8    [rSI]   [rDI]
0       10          [rAX]   [rCX]   [rDX]   [rBX]   [rSP]   [rBP] + disp32   [rSI]   [rDI]
1       00          [r8]    [r9]    [r10]   [r11]   [r12]   disp32           [r14]   [r15]
1       01          [r8]    [r9]    [r10]   [r11]   [r12]   [r13] + disp8    [r14]   [r15]
1       10          [r8]    [r9]    [r10]   [r11]   [r12]   [r13] + disp32   [r14]   [r15]

Table A-37. Addressing Modes: SIB Byte Encoding

SIB base Field¹ column headings:
  /0 through /7, REX.B = 0:   rAX   rCX   rDX   rBX   rSP   note 1   rSI   rDI
  /0 through /7, REX.B = 1:   r8    r9    r10   r11   r12   note 1   r14   r15

Effective Address                            SIB     SIB     SIB base Field¹
                                             scale   index   /0  /1  /2  /3  /4  /5  /6  /7
REX.X = 0             REX.X = 1              Field   Field   Complete SIB Byte (hex)
[rAX] + [base]        [r8] + [base]          00      000     00  01  02  03  04  05  06  07
[rCX] + [base]        [r9] + [base]          00      001     08  09  0A  0B  0C  0D  0E  0F
[rDX] + [base]        [r10] + [base]         00      010     10  11  12  13  14  15  16  17
[rBX] + [base]        [r11] + [base]         00      011     18  19  1A  1B  1C  1D  1E  1F
[base]                [r12] + [base]         00      100     20  21  22  23  24  25  26  27
[rBP] + [base]        [r13] + [base]         00      101     28  29  2A  2B  2C  2D  2E  2F
[rSI] + [base]        [r14] + [base]         00      110     30  31  32  33  34  35  36  37
[rDI] + [base]        [r15] + [base]         00      111     38  39  3A  3B  3C  3D  3E  3F
[rAX] * 2 + [base]    [r8] * 2 + [base]      01      000     40  41  42  43  44  45  46  47
[rCX] * 2 + [base]    [r9] * 2 + [base]      01      001     48  49  4A  4B  4C  4D  4E  4F
[rDX] * 2 + [base]    [r10] * 2 + [base]     01      010     50  51  52  53  54  55  56  57
[rBX] * 2 + [base]    [r11] * 2 + [base]     01      011     58  59  5A  5B  5C  5D  5E  5F
[base]                [r12] * 2 + [base]     01      100     60  61  62  63  64  65  66  67
[rBP] * 2 + [base]    [r13] * 2 + [base]     01      101     68  69  6A  6B  6C  6D  6E  6F
[rSI] * 2 + [base]    [r14] * 2 + [base]     01      110     70  71  72  73  74  75  76  77
[rDI] * 2 + [base]    [r15] * 2 + [base]     01      111     78  79  7A  7B  7C  7D  7E  7F
[rAX] * 4 + [base]    [r8] * 4 + [base]      10      000     80  81  82  83  84  85  86  87
[rCX] * 4 + [base]    [r9] * 4 + [base]      10      001     88  89  8A  8B  8C  8D  8E  8F
[rDX] * 4 + [base]    [r10] * 4 + [base]     10      010     90  91  92  93  94  95  96  97
[rBX] * 4 + [base]    [r11] * 4 + [base]     10      011     98  99  9A  9B  9C  9D  9E  9F
[base]                [r12] * 4 + [base]     10      100     A0  A1  A2  A3  A4  A5  A6  A7
[rBP] * 4 + [base]    [r13] * 4 + [base]     10      101     A8  A9  AA  AB  AC  AD  AE  AF
[rSI] * 4 + [base]    [r14] * 4 + [base]     10      110     B0  B1  B2  B3  B4  B5  B6  B7
[rDI] * 4 + [base]    [r15] * 4 + [base]     10      111     B8  B9  BA  BB  BC  BD  BE  BF
[rAX] * 8 + [base]    [r8] * 8 + [base]      11      000     C0  C1  C2  C3  C4  C5  C6  C7
[rCX] * 8 + [base]    [r9] * 8 + [base]      11      001     C8  C9  CA  CB  CC  CD  CE  CF
[rDX] * 8 + [base]    [r10] * 8 + [base]     11      010     D0  D1  D2  D3  D4  D5  D6  D7
[rBX] * 8 + [base]    [r11] * 8 + [base]     11      011     D8  D9  DA  DB  DC  DD  DE  DF
[base]                [r12] * 8 + [base]     11      100     E0  E1  E2  E3  E4  E5  E6  E7
[rBP] * 8 + [base]    [r13] * 8 + [base]     11      101     E8  E9  EA  EB  EC  ED  EE  EF
[rSI] * 8 + [base]    [r14] * 8 + [base]     11      110     F0  F1  F2  F3  F4  F5  F6  F7
[rDI] * 8 + [base]    [r15] * 8 + [base]     11      111     F8  F9  FA  FB  FC  FD  FE  FF

Notes:

1. See Table A-36 on page 502 for complete specification of SIB base field.

Appendix B General-Purpose Instructions in 64-Bit Mode


This appendix provides details of the general-purpose instructions in 64-bit mode and their differences
from legacy and compatibility modes. The appendix covers only the general-purpose instructions
(those described in Chapter 3, “General-Purpose Instruction Reference”). It does not cover the 128-bit
media, 64-bit media, or x87 floating-point instructions because those instructions are not affected by
64-bit mode, other than in the access by such instructions to extended GPR and XMM registers when
using a REX prefix.

B.1 General Rules for 64-Bit Mode 

In 64-bit mode, the following general rules apply to instructions and their operands: 

• “Promoted to 64 Bit”: If an instruction’s operand size (16-bit or 32-bit) in legacy and
compatibility modes depends on the CS.D bit and the operand-size override prefix, then the
operand-size choices in 64-bit mode are extended from 16-bit and 32-bit to include 64 bits (with a
REX prefix), or the operand size is fixed at 64 bits. Such instructions are said to be “Promoted to
64 bits” in Table B-1. However, byte-operand opcodes of such instructions are not promoted.

• Byte-Operand Opcodes Not Promoted: As stated above in “Promoted to 64 Bit”, byte-operand 
opcodes of promoted instructions are not promoted. Those opcodes continue to operate only on 
bytes. 

• Fixed Operand Size: If an instruction’s operand size is fixed in legacy mode (thus, independent of 
CS.D and prefix overrides), that operand size is usually fixed at the same size in 64-bit mode. For 
example, CPUID operates on 32-bit operands, irrespective of attempts to override the operand 
size. 

• Default Operand Size: The default operand size for most instructions is 32 bits, and a REX prefix
must be used to change the operand size to 64 bits. However, two groups of instructions default to
64-bit operand size and do not need a REX prefix: (1) near branches and (2) all instructions, except
far branches, that implicitly reference the RSP. See Table B-5 on page 533 for a list of all
instructions that default to 64-bit operand size.

• Zero-Extension of 32-Bit Results: Operations on 32-bit operands in 64-bit mode zero-extend the 
high 32 bits of 64-bit GPR destination registers. 

• No Extension of 8-Bit and 16-Bit Results: Operations on 8-bit and 16-bit operands in 64-bit 
mode leave the high 56 or 48 bits, respectively, of 64-bit GPR destination registers unchanged. 

• Shift and Rotate Counts: When the operand size is 64 bits, shifts and rotates use one additional 
bit (6 bits total) to specify shift-count or rotate-count, allowing 64-bit shifts and rotates. 

• Immediates: The maximum size of immediate operands is 32 bits, except that 64-bit immediates
can be MOVed into 64-bit GPRs. Immediates that are less than 64 bits are a maximum of 32 bits,
and are sign-extended to 64 bits during use.

• Displacements and Offsets: The maximum size of an address displacement or offset is 32 bits,
except that 64-bit offsets can be used by specific MOV opcodes that read or write AL or rAX.
Displacements and offsets that are less than 64 bits are a maximum of 32 bits, and are
sign-extended to 64 bits during use.

• Undefined High 32 Bits After Mode Change: The processor does not preserve the upper 32 bits 
of the 64-bit GPRs across switches from 64-bit mode to compatibility or legacy modes. In 
compatibility or legacy mode, the upper 32 bits of the GPRs are undefined and not accessible to 
software. 
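
Several of the rules above (zero-extension of 32-bit results, preservation of the upper bits for 8-bit
and 16-bit results, sign-extension of immediates, and the wider shift count) can be modeled in a short
Python sketch. The function names are hypothetical, and the sketch makes two assumptions that the list
above does not spell out: the shift count is masked to 5 bits for operand sizes other than 64 bits, and
8-bit writes target the low byte of the register (high-byte registers such as AH are not modeled).

```python
MASK64 = (1 << 64) - 1


def write_gpr(old: int, value: int, size_bits: int) -> int:
    """New 64-bit register value after writing a result of the given operand size."""
    if size_bits == 64:
        return value & MASK64
    if size_bits == 32:
        return value & 0xFFFF_FFFF           # high 32 bits are zeroed
    mask = (1 << size_bits) - 1              # 8-bit or 16-bit result
    return (old & ~mask & MASK64) | (value & mask)  # upper bits unchanged


def sign_extend_imm32(imm32: int) -> int:
    """Sign-extend a 32-bit immediate or displacement to 64 bits."""
    imm32 &= 0xFFFF_FFFF
    return (imm32 | ~0xFFFF_FFFF) & MASK64 if imm32 & 0x8000_0000 else imm32


def shift_count(count: int, operand_bits: int) -> int:
    """Effective shift or rotate count: 6 bits for 64-bit operands, assumed 5 bits otherwise."""
    return count & (0x3F if operand_bits == 64 else 0x1F)
```

For example, writing the 32-bit result AABBCCDDh to a register that holds 1122334455667788h leaves
00000000AABBCCDDh, whereas writing the 16-bit result BEEFh leaves 112233445566BEEFh.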


B.2 Operation and Operand Size in 64-Bit Mode 

Table B-1 lists the integer instructions, showing operand size in 64-bit mode and the state of the high
32 bits of destination registers when 32-bit operands are used. Opcodes, such as byte-operand versions
of several instructions, that do not appear in Table B-1 are covered by the general rules described in
“General Rules for 64-Bit Mode” on page 505.


Table B-1. Operations and Operands in 64-Bit Mode

Instruction and                            Type of        Default    For 32-Bit           For 64-Bit
Opcode (hex)¹                              Operation²     Operand    Operand Size⁴        Operand Size⁴
                                                          Size³
AAA - ASCII Adjust after Addition          INVALID IN 64-BIT MODE (invalid-opcode exception)
  37
AAD - ASCII Adjust AX before Division      INVALID IN 64-BIT MODE (invalid-opcode exception)
  D5
AAM - ASCII Adjust AX after Multiply       INVALID IN 64-BIT MODE (invalid-opcode exception)
  D4
AAS - ASCII Adjust AL after Subtraction    INVALID IN 64-BIT MODE (invalid-opcode exception)
  3F

Notes: 

1. See “General Rules for 64-Bit Mode” on page 505, for opcodes that do not appear in this table. 

2. The type of operation, excluding considerations of operand size or extension of results. See “General Rules for 64- 
Bit Mode” on page 505 for definitions of “Promoted to 64 bits” and related topics. 

3. If “Type of Operation” is 64 bits, a REX prefix is needed for 64-bit operand size, unless the instruction size defaults 
to 64 bits. If the operand size is fixed, operand-size overrides are silently ignored. 

4. Special actions in 64-bit mode, in addition to legacy-mode actions. Zero or sign extensions apply only to result
operands, not source operands. Unless otherwise stated, 8-bit and 16-bit results leave the high 56 or 48 bits,
respectively, of 64-bit destination registers unchanged. Immediates and branch displacements are sign-extended to 64
bits.

5. Any pointer registers (rDI, rSI) or count registers (rCX) are address-sized and default to 64 bits. For 32-bit address 
size, any pointer and count registers are zero-extended to 64 bits. 

6. The default operand size can be overridden to 16 bits with 66h prefix, but there is no 32-bit operand-size override 
in 64-bit mode. 
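
The result-size rules in notes 3 and 4 can be illustrated with a small model. This is a minimal sketch, not part of the architecture definition: the function name `write_gpr` is hypothetical, and the 8-bit case is assumed to target the low byte of the register (the legacy high-byte registers AH, BH, CH, DH are not modeled).

```python
MASK64 = (1 << 64) - 1


def write_gpr(old_value: int, result: int, operand_bits: int) -> int:
    """Model the 64-bit register contents after a result of the given size is written."""
    if operand_bits == 64:
        # A 64-bit result replaces the whole register.
        return result & MASK64
    if operand_bits == 32:
        # A 32-bit result is zero-extended to 64 bits.
        return result & 0xFFFF_FFFF
    if operand_bits in (8, 16):
        # 8-bit and 16-bit results leave the high 56 or 48 bits unchanged.
        low_mask = (1 << operand_bits) - 1
        return (old_value & MASK64 & ~low_mask) | (result & low_mask)
    raise ValueError("operand_bits must be 8, 16, 32, or 64")
```

For example, writing the 32-bit result 0x12345678 into a register that holds all ones leaves 0x0000000012345678, whereas writing the 16-bit result 0x1234 leaves 0xFFFFFFFFFFFF1234.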
Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| ADC - Add with Carry | 11, 13, 15, 81 /2, 83 /2 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| ADD - Signed or Unsigned Add | 01, 03, 05, 81 /0, 83 /0 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| AND - Logical AND | 21, 23, 25, 81 /4, 83 /4 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| ARPL - Adjust Requestor Privilege Level | 63 | OPCODE USED as MOVSXD in 64-BIT MODE | | | |
| BOUND - Check Array Against Bounds | 62 | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| BSF - Bit Scan Forward | 0F BC | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |


Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| BSR - Bit Scan Reverse | 0F BD | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| BSWAP - Byte Swap | 0F C8 through 0F CF | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Swap all 8 bytes of a 64-bit GPR. |
| BT - Bit Test | 0F A3, 0F BA /4 | Promoted to 64 bits. | 32 bits | No GPR register results. | No GPR register results. |
| BTC - Bit Test and Complement | 0F BB, 0F BA /7 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| BTR - Bit Test and Reset | 0F B3, 0F BA /6 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| BTS - Bit Test and Set | 0F AB, 0F BA /5 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| CALL - Procedure Call Near. See “Near Branches in 64-Bit Mode” in Volume 1. | E8 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 32-bit displacement sign-extended to 64 bits. |
| | FF /2 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = 64-bit offset from register or memory. |
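
The relative form of the near branch (E8) computes its target by sign-extending the displacement and adding it to RIP. The following is a minimal sketch of that arithmetic; the helper names are hypothetical, and `next_rip` is assumed to be the address of the instruction that follows the branch.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def near_branch_target(next_rip: int, displacement: int, disp_bits: int = 32) -> int:
    """RIP = RIP + displacement sign-extended to 64 bits (wraps at 64 bits)."""
    return (next_rip + sign_extend(displacement, disp_bits)) & MASK64
```

A displacement of 0xFFFFFFFB encodes -5, so a branch whose next instruction is at 0x401005 lands at 0x401000.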

Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| CALL - Procedure Call Far. See “Branches to 64-Bit Offsets” in Volume 1. | 9A | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| | FF /3 | Promoted to 64 bits. | 32 bits | If selector points to a gate, then RIP = 64-bit offset from gate, else RIP = zero-extended 32-bit offset from far pointer referenced in instruction. | If selector points to a gate, then RIP = 64-bit offset from gate, else RIP = zero-extended 32-bit offset from far pointer referenced in instruction. |
| CBW, CWDE, CDQE - Convert Byte to Word, Convert Word to Doubleword, Convert Doubleword to Quadword | 98 | Promoted to 64 bits. | 32 bits (size of destination register) | CWDE: Converts word to doubleword. Zero-extends EAX to RAX. | CDQE (new mnemonic): Converts doubleword to quadword. RAX = sign-extended EAX. |
| CDQ | | see CWD, CDQ, CQO | | | |
| CDQE (new mnemonic) | | see CBW, CWDE, CDQE | | | |
| CWDE | | see CBW, CWDE, CDQE | | | |
| CLC - Clear Carry Flag | F8 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| CLD - Clear Direction Flag | FC | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| CLFLUSH - Cache Line Invalidate | 0F AE /7 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| CLGI - Clear Global Interrupt | 0F 01 DD | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| CLI - Clear Interrupt Flag | FA | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
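
The three mnemonics that share opcode 98 differ only in operand size. As a rough sketch of the behavior listed for CBW, CWDE, and CDQE (the function name `opcode_98` is hypothetical, and the 16-bit case follows the general rule that 16-bit results leave the high 48 bits unchanged):

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def opcode_98(rax: int, operand_bits: int) -> int:
    """Return RAX after CBW (16), CWDE (32), or CDQE (64)."""
    if operand_bits not in (16, 32, 64):
        raise ValueError("operand_bits must be 16, 32, or 64")
    # The source is the low half of the destination: AL, AX, or EAX.
    result = sign_extend(rax, operand_bits // 2) & ((1 << operand_bits) - 1)
    if operand_bits == 16:
        # CBW: AX is written; the high 48 bits of RAX are unchanged.
        return (rax & MASK64 & ~0xFFFF) | result
    # CWDE: EAX is zero-extended to RAX. CDQE: RAX = sign-extended EAX.
    return result
```

With AX = 0x8000, CWDE produces EAX = 0xFFFF8000 and clears the high 32 bits of RAX; with EAX = 0x80000000, CDQE produces RAX = 0xFFFFFFFF80000000.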


Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| CLTS - Clear Task-Switched Flag in CR0 | 0F 06 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| CMC - Complement Carry Flag | F5 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| CMOVcc - Conditional Move | 0F 40 through 0F 4F | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. This occurs even if the condition is false. | |
| CMP - Compare | 39, 3B, 3D, 81 /7, 83 /7 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| CMPS, CMPSW, CMPSD, CMPSQ - Compare Strings | A7 | Promoted to 64 bits. | 32 bits | CMPSD: Compare String Doublewords. See footnote 5 | CMPSQ (new mnemonic): Compare String Quadwords. See footnote 5 |
| CMPXCHG - Compare and Exchange | 0F B1 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
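
The CMOVcc entry is easy to misread: with a 32-bit operand size, the destination is zero-extended whether or not the move takes place. A minimal sketch of that behavior, using a hypothetical helper `cmov32` that takes the full 64-bit register contents:

```python
def cmov32(dest: int, src: int, condition: bool) -> int:
    """Return the 64-bit destination register after a 32-bit CMOVcc."""
    # The low 32 bits come from src only when the condition is true.
    chosen = src if condition else dest
    # The high 32 bits are cleared in either case.
    return chosen & 0xFFFF_FFFF
```

A false condition therefore still changes the destination when its high 32 bits were nonzero: 0xAAAABBBBCCCCDDDD becomes 0x00000000CCCCDDDD.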


Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| CMPXCHG8B - Compare and Exchange Eight Bytes | 0F C7 /1 | Same as legacy mode. | 32 bits. | Zero-extends EDX and EAX to 64 bits. | CMPXCHG16B (new mnemonic): Compare and Exchange 16 Bytes. |
| CPUID - Processor Identification | 0F A2 | Same as legacy mode. | Operand size fixed at 32 bits. | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| CQO (new mnemonic) | | see CWD, CDQ, CQO | | | |
| CWD, CDQ, CQO - Convert Word to Doubleword, Convert Doubleword to Quadword, Convert Quadword to Double Quadword | 99 | Promoted to 64 bits. | 32 bits (size of destination register) | CDQ: Converts doubleword to quadword. Sign-extends EAX to EDX. Zero-extends EDX to RDX. RAX is unchanged. | CQO (new mnemonic): Converts quadword to double quadword. Sign-extends RAX to RDX. RAX is unchanged. |
| DAA - Decimal Adjust AL after Addition | 27 | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| DAS - Decimal Adjust AL after Subtraction | 2F | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
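
The CDQ and CQO entries for opcode 99 can be sketched as follows. This is an illustrative model with hypothetical function names; each function returns the pair (RAX, RDX) after the instruction executes.

```python
MASK64 = (1 << 64) - 1


def cdq(rax: int) -> tuple[int, int]:
    """32-bit operand size: sign-extend EAX into EDX, zero-extend EDX to RDX."""
    edx = 0xFFFF_FFFF if rax & 0x8000_0000 else 0
    # RAX is unchanged, including its high 32 bits.
    return rax & MASK64, edx


def cqo(rax: int) -> tuple[int, int]:
    """64-bit operand size: sign-extend RAX into RDX."""
    rdx = MASK64 if (rax >> 63) & 1 else 0
    return rax & MASK64, rdx
```

Note the asymmetry: CDQ fills only the low 32 bits of RDX with the sign of EAX, while CQO fills all 64 bits with the sign of RAX.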

Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| DEC - Decrement by 1 | FF /1 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| | 48 through 4F | OPCODE USED as REX PREFIX in 64-BIT MODE | | | |
| DIV - Unsigned Divide | F7 /6 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | RDX:RAX contain a 64-bit quotient (RAX) and 64-bit remainder (RDX). |
| ENTER - Create Procedure Stack Frame | C8 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | |
| HLT - Halt | F4 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| IDIV - Signed Divide | F7 /7 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | RDX:RAX contain a 64-bit quotient (RAX) and 64-bit remainder (RDX). |
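
For the 64-bit form of DIV, the 128-bit dividend is taken from RDX:RAX, and the quotient and remainder are returned in RAX and RDX. The sketch below models only that arithmetic. The function name `div64` is hypothetical, and Python exceptions stand in for the divide-error exception raised on a zero divisor or a quotient that does not fit in 64 bits.

```python
MASK64 = (1 << 64) - 1


def div64(rdx: int, rax: int, divisor: int) -> tuple[int, int]:
    """Unsigned divide of RDX:RAX by a 64-bit divisor; returns (RAX, RDX)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    dividend = ((rdx & MASK64) << 64) | (rax & MASK64)
    quotient, remainder = divmod(dividend, divisor & MASK64)
    if quotient > MASK64:
        raise OverflowError("divide error: quotient does not fit in 64 bits")
    # Quotient goes to RAX, remainder to RDX.
    return quotient, remainder
```

Dividing RDX:RAX = 1:0 (the value 2^64) by 2 gives a quotient of 2^63 in RAX and a remainder of 0 in RDX.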

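Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| IMUL - Signed Multiply | F7 /5 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | RDX:RAX = RAX * reg/mem64 (i.e., 128-bit result) |
| | 0F AF | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | reg64 = reg64 * reg/mem64 |
| | 69 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | reg64 = reg/mem64 * imm32 |
| | 6B | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | reg64 = reg/mem64 * imm8 |
| IN - Input From Port | E5, ED | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| INC - Increment by 1 | FF /0 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| | 40 through 47 | OPCODE USED as REX PREFIX in 64-BIT MODE | | | |
| INS, INSW, INSD - Input String | 6D | Same as legacy mode. | 32 bits | INSD: Input String Doublewords. No GPR register results. See footnote 5 | INSD: Input String Doublewords. No GPR register results. See footnote 5 |
| INT n - Interrupt to Vector | CD | Promoted to 64 bits. | Not relevant. | See “Long-Mode Interrupt Control Transfers” in Volume 2. | See “Long-Mode Interrupt Control Transfers” in Volume 2. |
| INT3 - Interrupt to Debug Vector | CC | Promoted to 64 bits. | Not relevant. | See “Long-Mode Interrupt Control Transfers” in Volume 2. | See “Long-Mode Interrupt Control Transfers” in Volume 2. |
| INTO - Interrupt to Overflow Vector | CE | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |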
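
The one-operand form of IMUL (F7 /5) with a 64-bit operand size produces a 128-bit signed product split across RDX and RAX. A minimal sketch of that split, with hypothetical helper names:

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def to_signed64(value: int) -> int:
    """Interpret a 64-bit register value as a two's-complement number."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def imul_rdx_rax(rax: int, operand: int) -> tuple[int, int]:
    """RDX:RAX = RAX * reg/mem64 (signed); returns (RDX, RAX)."""
    product = (to_signed64(rax) * to_signed64(operand)) & MASK128
    # High 64 bits go to RDX, low 64 bits to RAX.
    return product >> 64, product & MASK64
```

Multiplying -1 by 2 yields -2, which as a 128-bit value places all ones in RDX and 0xFFFFFFFFFFFFFFFE in RAX.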


Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| INVD - Invalidate Internal Caches | 0F 08 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| INVLPG - Invalidate TLB Entry | 0F 01 /7 | Promoted to 64 bits. | Not relevant. | No GPR register results. | No GPR register results. |
| INVLPGA - Invalidate TLB Entry in a Specified ASID | 0F 01 DF | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| IRET, IRETD, IRETQ - Interrupt Return | CF | Promoted to 64 bits. | 32 bits | IRETD: Interrupt Return Doubleword. See “Long-Mode Interrupt Control Transfers” in Volume 2. | IRETQ (new mnemonic): Interrupt Return Quadword. See “Long-Mode Interrupt Control Transfers” in Volume 2. |
| Jcc - Jump Conditional. See “Near Branches in 64-Bit Mode” in Volume 1. | 70 through 7F | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 8-bit displacement sign-extended to 64 bits. |
| | 0F 80 through 0F 8F | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 32-bit displacement sign-extended to 64 bits. |
| JCXZ, JECXZ, JRCXZ - Jump on CX/ECX/RCX Zero | E3 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 8-bit displacement sign-extended to 64 bits. See footnote 5 |

Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| JMP - Jump Near. See “Near Branches in 64-Bit Mode” in Volume 1. | EB | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 8-bit displacement sign-extended to 64 bits. |
| | E9 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 32-bit displacement sign-extended to 64 bits. |
| | FF /4 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = 64-bit offset from register or memory. |
| JMP - Jump Far. See “Branches to 64-Bit Offsets” in Volume 1. | EA | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| | FF /5 | Promoted to 64 bits. | 32 bits | If selector points to a gate, then RIP = 64-bit offset from gate, else RIP = zero-extended 32-bit offset from far pointer referenced in instruction. | If selector points to a gate, then RIP = 64-bit offset from gate, else RIP = zero-extended 32-bit offset from far pointer referenced in instruction. |
| LAHF - Load Status Flags into AH Register | 9F | Same as legacy mode. | Not relevant. | | |
| LAR - Load Access Rights Byte | 0F 02 | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| LDS - Load DS Far Pointer | C5 | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |


Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| LEA - Load Effective Address | 8D | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| LEAVE - Delete Procedure Stack Frame | C9 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | |
| LES - Load ES Far Pointer | C4 | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| LFENCE - Load Fence | 0F AE /5 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| LFS - Load FS Far Pointer | 0F B4 | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| LGDT - Load Global Descriptor Table Register | 0F 01 /2 | Promoted to 64 bits. | Operand size fixed at 64 bits. | No GPR register results. Loads 8-byte base and 2-byte limit. | No GPR register results. Loads 8-byte base and 2-byte limit. |
| LGS - Load GS Far Pointer | 0F B5 | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| LIDT - Load Interrupt Descriptor Table Register | 0F 01 /3 | Promoted to 64 bits. | Operand size fixed at 64 bits. | No GPR register results. Loads 8-byte base and 2-byte limit. | No GPR register results. Loads 8-byte base and 2-byte limit. |
| LLDT - Load Local Descriptor Table Register | 0F 00 /2 | Promoted to 64 bits. | Operand size fixed at 16 bits. | No GPR register results. References 16-byte descriptor to load 64-bit base. | No GPR register results. References 16-byte descriptor to load 64-bit base. |
| LMSW - Load Machine Status Word | 0F 01 /6 | Same as legacy mode. | Operand size fixed at 16 bits. | No GPR register results. | No GPR register results. |

Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| LODS, LODSW, LODSD, LODSQ - Load String | AD | Promoted to 64 bits. | 32 bits | LODSD: Load String Doublewords. Zero-extends 32-bit register results to 64 bits. See footnote 5 | LODSQ (new mnemonic): Load String Quadwords. See footnote 5 |
| LOOP - Loop | E2 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 8-bit displacement sign-extended to 64 bits. See footnote 5 |
| LOOPZ, LOOPE - Loop if Zero/Equal | E1 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 8-bit displacement sign-extended to 64 bits. See footnote 5 |
| LOOPNZ, LOOPNE - Loop if Not Zero/Equal | E0 | Promoted to 64 bits. | 64 bits | Can’t encode.⁶ | RIP = RIP + 8-bit displacement sign-extended to 64 bits. See footnote 5 |
| LSL - Load Segment Limit | 0F 03 | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| LSS - Load SS Segment Register | 0F B2 | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| LTR - Load Task Register | 0F 00 /3 | Promoted to 64 bits. | Operand size fixed at 16 bits. | No GPR register results. References 16-byte descriptor to load 64-bit base. | No GPR register results. References 16-byte descriptor to load 64-bit base. |
| LZCNT - Count Leading Zeros | F3 0F BD | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| MFENCE - Memory Fence | 0F AE /6 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| MONITOR - Setup Monitor Address | 0F 01 C8 | Same as legacy mode. | Operand size fixed at 32 bits. | No GPR register results. | No GPR register results. |

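Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| MOV - Move | 89, 8B | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| | C7 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | 32-bit immediate is sign-extended to 64 bits. |
| | B8 through BF | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | 64-bit immediate. |
| | A1 (moffset), A3 (moffset) | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. Memory offsets are address-sized and default to 64 bits. | Memory offsets are address-sized and default to 64 bits. |
| MOV - Move to/from Segment Registers | 8C | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| | 8E | Same as legacy mode. | Operand size fixed at 16 bits. | No GPR register results. | No GPR register results. |
| MOV(CRn) - Move to/from Control Registers | 0F 22, 0F 20 | Promoted to 64 bits. | Operand size fixed at 64 bits. | The high 32 bits of control registers differ in their writability and reserved status. See “System Resources” in Volume 2 for details. | The high 32 bits of control registers differ in their writability and reserved status. See “System Resources” in Volume 2 for details. |
| MOV(DRn) - Move to/from Debug Registers | 0F 21, 0F 23 | Promoted to 64 bits. | Operand size fixed at 64 bits. | The high 32 bits of debug registers differ in their writability and reserved status. See “Debug and Performance Resources” in Volume 2 for details. | The high 32 bits of debug registers differ in their writability and reserved status. See “Debug and Performance Resources” in Volume 2 for details. |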


Table B-1. Operations and Operands in 64-Bit Mode (continued) 

| Instruction | Opcode (hex)¹ | Type of Operation² | Default Operand Size³ | For 32-Bit Operand Size⁴ | For 64-Bit Operand Size⁴ |
| --- | --- | --- | --- | --- | --- |
| MOVD - Move Doubleword or Quadword | 0F 6E, 0F 7E | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| | 66 0F 6E, 66 0F 7E | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 128 bits. | Zero-extends 64-bit register results to 128 bits. |
| MOVNTI - Move Non-Temporal Doubleword | 0F C3 | Promoted to 64 bits. | 32 bits | No GPR register results. | No GPR register results. |
| MOVS, MOVSW, MOVSD, MOVSQ - Move String | A5 | Promoted to 64 bits. | 32 bits | MOVSD: Move String Doublewords. See footnote 5 | MOVSQ (new mnemonic): Move String Quadwords. See footnote 5 |
| MOVSX - Move with Sign-Extend | 0F BE | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Sign-extends byte to quadword. |
| | 0F BF | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Sign-extends word to quadword. |
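
The MOVSX entries combine two rules: the source is sign-extended to the destination size, and a 32-bit destination is then zero-extended to 64 bits. The following sketch models the 32-bit and 64-bit destination sizes only; the function names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def movsx(source: int, source_bits: int, dest_bits: int) -> int:
    """Return the 64-bit destination register after MOVSX."""
    if source_bits not in (8, 16) or dest_bits not in (32, 64):
        raise ValueError("sketch covers 8/16-bit sources and 32/64-bit destinations")
    # Truncating to dest_bits leaves the high 32 bits zero for a 32-bit destination.
    return sign_extend(source, source_bits) & ((1 << dest_bits) - 1)
```

The byte 0x80 becomes 0xFFFFFFFFFFFFFF80 with a 64-bit destination, but 0x00000000FFFFFF80 with a 32-bit destination.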


Notes:

1. See “General Rules for 64-Bit Mode” on page 505 for opcodes that do not appear in this table.

2. The type of operation, excluding considerations of operand size or extension of results. See “General Rules for
64-Bit Mode” on page 505 for definitions of “Promoted to 64 bits” and related topics.

3. If “Type of Operation” is 64 bits, a REX prefix is needed for 64-bit operand size, unless the instruction size defaults
to 64 bits. If the operand size is fixed, operand-size overrides are silently ignored.

4. Special actions in 64-bit mode, in addition to legacy-mode actions. Zero or sign extensions apply only to result
operands, not source operands. Unless otherwise stated, 8-bit and 16-bit results leave the high 56 or 48 bits,
respectively, of 64-bit destination registers unchanged. Immediates and branch displacements are sign-extended to 64
bits.

5. Any pointer registers (rDI, rSI) or count registers (rCX) are address-sized and default to 64 bits. For 32-bit address
size, any pointer and count registers are zero-extended to 64 bits.

6. The default operand size can be overridden to 16 bits with 66h prefix, but there is no 32-bit operand-size override
in 64-bit mode.
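
The result-size rule in note 4 can be illustrated with a small model. The following is a minimal sketch, not a definitive implementation: the function name `write_gpr` and the integer register model are assumptions made for illustration, and the legacy high-byte registers (AH, BH, CH, DH) are not modeled.

```python
MASK64 = (1 << 64) - 1


def write_gpr(old_value: int, result: int, operand_size: int) -> int:
    """Return the new 64-bit register value after a result of the given size is written."""
    if operand_size == 64:
        # A 64-bit result replaces the whole register.
        return result & MASK64
    if operand_size == 32:
        # 32-bit register results are zero-extended to 64 bits.
        return result & 0xFFFF_FFFF
    if operand_size in (8, 16):
        # 8-bit and 16-bit results leave the high 56 or 48 bits unchanged.
        low_mask = (1 << operand_size) - 1
        return (old_value & ~low_mask & MASK64) | (result & low_mask)
    raise ValueError("operand_size must be 8, 16, 32, or 64")
```

For example, writing a 32-bit result into a register that held all ones clears the upper 32 bits, while a 16-bit write preserves the upper 48 bits.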
| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| MOVSXD (Move with Sign-Extend Doubleword) | 63 | New instruction, available only in 64-bit mode. (In other modes, this opcode is the ARPL instruction.) | 32 bits | Zero-extends 32-bit register results to 64 bits. | Sign-extends doubleword to quadword. |
| MOVZX (Move with Zero-Extend) | 0F B6 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends byte to quadword. |
| MOVZX (Move with Zero-Extend) | 0F B7 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends word to quadword. |
| MUL (Multiply Unsigned) | F7 /4 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | RDX:RAX = RAX * quadword in register or memory. |
| MWAIT (Monitor Wait) | 0F 01 C9 | Same as legacy mode. | Operand size fixed at 32 bits. | No GPR register results. | No GPR register results. |
| NEG (Negate Two’s Complement) | F7 /3 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| NOP (No Operation) | 90 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| NOT (Negate One’s Complement) | F7 /2 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
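
The MOVSXD, MOVZX, and MUL entries describe three different ways a 64-bit result is formed. The sketch below models each on plain Python integers; the function names are hypothetical and chosen only for illustration, and flags are not modeled.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, from_bits: int, to_bits: int = 64) -> int:
    """Sign-extend the low from_bits of value to to_bits (MOVSX/MOVSXD style)."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    # XOR-and-subtract yields a negative integer when the sign bit is set.
    extended = (value ^ sign_bit) - sign_bit
    return extended & ((1 << to_bits) - 1)


def zero_extend(value: int, from_bits: int) -> int:
    """Zero-extend the low from_bits of value (MOVZX style)."""
    return value & ((1 << from_bits) - 1)


def mul64(rax: int, source: int) -> tuple[int, int]:
    """Unsigned 64-bit multiply; returns (rdx, rax) holding the 128-bit product."""
    product = (rax & MASK64) * (source & MASK64)
    return (product >> 64) & MASK64, product & MASK64
```

A doubleword with its top bit set, such as 8000_0000h, sign-extends to FFFF_FFFF_8000_0000h, whereas zero-extension leaves the upper bits clear.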




| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| OR (Logical OR) | 09, 0B, 0D, 81 /1, 83 /1 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| OUT (Output to Port) | E7, EF | Same as legacy mode. | 32 bits | No GPR register results. | No GPR register results. |
| OUTS, OUTSW, OUTSD (Output String) | 6F | Same as legacy mode. | 32 bits | Writes doubleword to I/O port. No GPR register results. See footnote 5. | Writes doubleword to I/O port. No GPR register results. See footnote 5. |
| PAUSE (Pause) | F3 90 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| POP (Pop Stack) | 8F /0, 58 through 5F | Promoted to 64 bits. | 64 bits | Cannot encode (see footnote 6). | No GPR register results. |
| POP (Pop segment register from Stack) | 0F A1 (POP FS), 0F A9 (POP GS) | Same as legacy mode. | 64 bits | Cannot encode (see footnote 6). | No GPR register results. |
| POP (Pop segment register from Stack) | 1F (POP DS), 07 (POP ES), 17 (POP SS) | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |




| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| POPA, POPAD (Pop All to GPR Words or Doublewords) | 61 | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| POPCNT (Bit Population Count) | F3 0F B8 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Zero-extends 32-bit register results to 64 bits. |
| POPF, POPFD, POPFQ (Pop to rFLAGS Word, Doubleword, or Quadword) | 9D | Promoted to 64 bits. | 64 bits | Cannot encode (see footnote 6). | POPFQ (new mnemonic): Pops 64 bits off stack, writes low 32 bits into EFLAGS and zero-extends the high 32 bits of RFLAGS. |
| PREFETCH (Prefetch L1 Data-Cache Line) | 0F 0D /0 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| PREFETCHlevel (Prefetch Data to Cache Level level) | 0F 18 /0-3 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| PREFETCHW (Prefetch L1 Data-Cache Line for Write) | 0F 0D /1 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| PUSH (Push onto Stack) | FF /6, 50 through 57, 6A, 68 | Promoted to 64 bits. | 64 bits | Cannot encode (see footnote 6). | |




| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| PUSH (Push segment register onto Stack) | 0F A0 (PUSH FS), 0F A8 (PUSH GS) | Promoted to 64 bits. | 64 bits | Cannot encode (see footnote 6). | |
| PUSH (Push segment register onto Stack) | 0E (PUSH CS), 1E (PUSH DS), 06 (PUSH ES), 16 (PUSH SS) | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| PUSHA, PUSHAD (Push All to GPR Words or Doublewords) | 60 | INVALID IN 64-BIT MODE (invalid-opcode exception) | | | |
| PUSHF, PUSHFD, PUSHFQ (Push rFLAGS Word, Doubleword, or Quadword onto Stack) | 9C | Promoted to 64 bits. | 64 bits | Cannot encode (see footnote 6). | PUSHFQ (new mnemonic): Pushes the 64-bit RFLAGS register. |
| RCL (Rotate Through Carry Left) | D1 /2, D3 /2, C1 /2 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| RCR (Rotate Through Carry Right) | D1 /3, D3 /3, C1 /3 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| RDMSR (Read Model-Specific Register) | 0F 32 | Same as legacy mode. | Not relevant. | RDX[31:0] contains MSR[63:32], RAX[31:0] contains MSR[31:0]. Zero-extends 32-bit register results to 64 bits. | RDX[31:0] contains MSR[63:32], RAX[31:0] contains MSR[31:0]. Zero-extends 32-bit register results to 64 bits. |
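
The RDMSR entry says the 64-bit MSR value is split across the low halves of RDX and RAX, with both registers zero-extended. A minimal sketch of that split and of the usual software recombination follows; the function names are hypothetical, and no real MSR access is performed.

```python
def rdmsr_registers(msr_value: int) -> tuple[int, int]:
    """Return (rdx, rax) as RDMSR leaves them in 64-bit mode for a given MSR value."""
    rax = msr_value & 0xFFFF_FFFF          # RAX[31:0] = MSR[31:0], upper half zero
    rdx = (msr_value >> 32) & 0xFFFF_FFFF  # RDX[31:0] = MSR[63:32], upper half zero
    return rdx, rax


def combine_rdx_rax(rdx: int, rax: int) -> int:
    """Rebuild the 64-bit value from the low 32 bits of RDX and RAX."""
    return ((rdx & 0xFFFF_FFFF) << 32) | (rax & 0xFFFF_FFFF)
```

Because only bits 31:0 of each register carry data, software that wants the value in one register has to shift and OR the halves together, as `combine_rdx_rax` does.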




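| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| RDPMC (Read Performance-Monitoring Counters) | 0F 33 | Same as legacy mode. | Not relevant. | RDX[31:0] contains PMC[63:32], RAX[31:0] contains PMC[31:0]. Zero-extends 32-bit register results to 64 bits. | RDX[31:0] contains PMC[63:32], RAX[31:0] contains PMC[31:0]. Zero-extends 32-bit register results to 64 bits. |
| RDTSC (Read Time-Stamp Counter) | 0F 31 | Same as legacy mode. | Not relevant. | RDX[31:0] contains TSC[63:32], RAX[31:0] contains TSC[31:0]. Zero-extends 32-bit register results to 64 bits. | RDX[31:0] contains TSC[63:32], RAX[31:0] contains TSC[31:0]. Zero-extends 32-bit register results to 64 bits. |
| RDTSCP (Read Time-Stamp Counter and Processor ID) | 0F 01 F9 | Same as legacy mode. | Not relevant. | RDX[31:0] contains TSC[63:32], RAX[31:0] contains TSC[31:0], RCX[31:0] contains the TSC_AUX MSR C000_0103h[31:0]. Zero-extends 32-bit register results to 64 bits. | RDX[31:0] contains TSC[63:32], RAX[31:0] contains TSC[31:0], RCX[31:0] contains the TSC_AUX MSR C000_0103h[31:0]. Zero-extends 32-bit register results to 64 bits. |
| REP INS (Repeat Input String) | F3 6D | Same as legacy mode. | 32 bits | Reads doubleword I/O port. See footnote 5. | Reads doubleword I/O port. See footnote 5. |
| REP LODS (Repeat Load String) | F3 AD | Promoted to 64 bits. | 32 bits | Zero-extends EAX to 64 bits. See footnote 5. | See footnote 5. |
| REP MOVS (Repeat Move String) | F3 A5 | Promoted to 64 bits. | 32 bits | No GPR register results. See footnote 5. | No GPR register results. See footnote 5. |
| REP OUTS (Repeat Output String to Port) | F3 6F | Same as legacy mode. | 32 bits | Writes doubleword to I/O port. No GPR register results. See footnote 5. | Writes doubleword to I/O port. No GPR register results. See footnote 5. |
| REP STOS (Repeat Store String) | F3 AB | Promoted to 64 bits. | 32 bits | No GPR register results. See footnote 5. | No GPR register results. See footnote 5. |
| REPx CMPS (Repeat Compare String) | F3 A7 | Promoted to 64 bits. | 32 bits | No GPR register results. See footnote 5. | No GPR register results. See footnote 5. |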



| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| REPx SCAS (Repeat Scan String) | F3 AF | Promoted to 64 bits. | 32 bits | No GPR register results. See footnote 5. | No GPR register results. See footnote 5. |
| RET (Return from Call Near). See “Near Branches in 64-Bit Mode” in Volume 1. | C2, C3 | Promoted to 64 bits. | 64 bits | Cannot encode (see footnote 6). | No GPR register results. |
| RET (Return from Call Far) | CB, CA | Promoted to 64 bits. | 32 bits | See “Control Transfers” in Volume 1 and “Control-Transfer Privilege Checks” in Volume 2. | See “Control Transfers” in Volume 1 and “Control-Transfer Privilege Checks” in Volume 2. |
| ROL (Rotate Left) | D1 /0, D3 /0, C1 /0 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| ROR (Rotate Right) | D1 /1, D3 /1, C1 /1 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| RSM (Resume from System Management Mode) | 0F AA | New SMM state-save area. | Not relevant. | See “System-Management Mode” in Volume 2. | See “System-Management Mode” in Volume 2. |
| SAHF (Store AH into Flags) | 9E | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| SAL (Shift Arithmetic Left) | D1 /4, D3 /4, C1 /4 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
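
The “Uses 6-bit count” entries mean that, with a 64-bit operand size, the rotate or shift count is masked to its low six bits. The sketch below models this for ROL only; the 5-bit mask used for the 32-bit case is taken from the general shift and rotate rules rather than from this table, and the function name `rol` is an illustrative assumption. Flags are not modeled.

```python
def rol(value: int, count: int, operand_size: int) -> int:
    """Rotate value left within operand_size bits (32 or 64), masking the count first."""
    if operand_size not in (32, 64):
        raise ValueError("this sketch models only 32-bit and 64-bit operands")
    # 64-bit operand size uses a 6-bit count; 32-bit operand size uses a 5-bit count.
    count &= 0x3F if operand_size == 64 else 0x1F
    width_mask = (1 << operand_size) - 1
    value &= width_mask
    if count == 0:
        return value
    return ((value << count) | (value >> (operand_size - count))) & width_mask
```

With a 64-bit operand, a count of 65 therefore behaves like a count of 1, and a count of 64 leaves the operand unchanged.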




| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| SAR (Shift Arithmetic Right) | D1 /7, D3 /7, C1 /7 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| SBB (Subtract with Borrow) | 19, 1B, 1D, 81 /3, 83 /3 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| SCAS, SCASW, SCASD, SCASQ (Scan String) | AF | Promoted to 64 bits. | 32 bits | SCASD: Scan String Doublewords. Zero-extends 32-bit register results to 64 bits. See footnote 5. | SCASQ (new mnemonic): Scan String Quadwords. See footnote 5. |
| SFENCE (Store Fence) | 0F AE /7 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| SGDT (Store Global Descriptor Table Register) | 0F 01 /0 | Promoted to 64 bits. | Operand size fixed at 64 bits. | No GPR register results. Stores 8-byte base and 2-byte limit. | No GPR register results. Stores 8-byte base and 2-byte limit. |
| SHL (Shift Left) | D1 /4, D3 /4, C1 /4 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
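
The SGDT entry notes that 64-bit mode stores an 8-byte base and a 2-byte limit, for 10 bytes in total. The sketch below packs and unpacks such an image with Python's `struct` module. The byte order (limit in the low two bytes, base in the following eight) follows the SGDT instruction description rather than this table, and the function names are hypothetical.

```python
import struct

# "<HQ": little-endian, unpadded; 2-byte limit followed by 8-byte base.
DESCRIPTOR_TABLE_IMAGE = struct.Struct("<HQ")


def pack_table_register(base: int, limit: int) -> bytes:
    """Build the 10-byte memory image that SGDT would store in 64-bit mode."""
    return DESCRIPTOR_TABLE_IMAGE.pack(limit, base)


def unpack_table_register(image: bytes) -> tuple[int, int]:
    """Return (base, limit) from a 10-byte memory image."""
    limit, base = DESCRIPTOR_TABLE_IMAGE.unpack(image)
    return base, limit
```

A caller that reserves memory for the operand needs 10 bytes in 64-bit mode, compared with 6 bytes (4-byte base plus 2-byte limit) in legacy mode.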



| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| SHLD (Shift Left Double) | 0F A4, 0F A5 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| SHR (Shift Right) | D1 /5, D3 /5, C1 /5 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| SHRD (Shift Right Double) | 0F AC, 0F AD | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Uses 6-bit count. |
| SIDT (Store Interrupt Descriptor Table Register) | 0F 01 /1 | Promoted to 64 bits. | Operand size fixed at 64 bits. | No GPR register results. Stores 8-byte base and 2-byte limit. | No GPR register results. Stores 8-byte base and 2-byte limit. |
| SKINIT (Secure Init and Jump with Attestation) | 0F 01 DE | Same as legacy mode. | Not relevant. | Zero-extends 32-bit register results to 64 bits. | |
| SLDT (Store Local Descriptor Table Register) | 0F 00 /0 | Same as legacy mode. | 32 bits | Zero-extends 2-byte LDT selector to 64 bits. | Zero-extends 2-byte LDT selector to 64 bits. |
| SMSW (Store Machine Status Word) | 0F 01 /4 | Same as legacy mode. | 32 bits | Zero-extends 32-bit register results to 64 bits. | Stores 64-bit machine status word (CR0). |
| STC (Set Carry Flag) | F9 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| STD (Set Direction Flag) | FD | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |



| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| STGI (Set Global Interrupt Flag) | 0F 01 DC | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| STI (Set Interrupt Flag) | FB | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| STOS, STOSW, STOSD, STOSQ (Store String) | AB | Promoted to 64 bits. | 32 bits | STOSD: Store String Doublewords. See footnote 5. | STOSQ (new mnemonic): Store String Quadwords. See footnote 5. |
| STR (Store Task Register) | 0F 00 /1 | Same as legacy mode. | 32 bits | Zero-extends 2-byte TR selector to 64 bits. | Zero-extends 2-byte TR selector to 64 bits. |
| SUB (Subtract) | 29, 2B, 2D, 81 /5, 83 /5 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| SWAPGS (Swap GS Register with KernelGSbase MSR) | 0F 01 /7 | New instruction, available only in 64-bit mode. (In other modes, this opcode is invalid.) | Not relevant. | See “SWAPGS Instruction” in Volume 2. | See “SWAPGS Instruction” in Volume 2. |
| SYSCALL (Fast System Call) | 0F 05 | Promoted to 64 bits. | Not relevant. | See “SYSCALL and SYSRET Instructions” in Volume 2 for details. | See “SYSCALL and SYSRET Instructions” in Volume 2 for details. |



| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| SYSENTER (System Call) | 0F 34 | INVALID IN LONG MODE (invalid-opcode exception) | | | |
| SYSEXIT (System Return) | 0F 35 | INVALID IN LONG MODE (invalid-opcode exception) | | | |
| SYSRET (Fast System Return) | 0F 07 | Promoted to 64 bits. | 32 bits | See “SYSCALL and SYSRET Instructions” in Volume 2 for details. | See “SYSCALL and SYSRET Instructions” in Volume 2 for details. |
| TEST (Test Bits) | 85, A9, F7 /0 | Promoted to 64 bits. | 32 bits | No GPR register results. | No GPR register results. |
| UD2 (Undefined Operation) | 0F 0B | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| VERR (Verify Segment for Reads) | 0F 00 /4 | Same as legacy mode. | Operand size fixed at 16 bits. | No GPR register results. | No GPR register results. |
| VERW (Verify Segment for Writes) | 0F 00 /5 | Same as legacy mode. | Operand size fixed at 16 bits. | No GPR register results. | No GPR register results. |
| VMLOAD (Load State from VMCB) | 0F 01 DA | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| VMMCALL (Call VMM) | 0F 01 D9 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| VMRUN (Run Virtual Machine) | 0F 01 D8 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| VMSAVE (Save State to VMCB) | 0F 01 DB | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |



| Instruction | Opcode (hex) (note 1) | Type of Operation (note 2) | Default Operand Size (note 3) | For 32-Bit Operand Size (note 4) | For 64-Bit Operand Size (note 4) |
|---|---|---|---|---|---|
| WAIT (Wait for Interrupt) | 9B | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| WBINVD (Writeback and Invalidate All Caches) | 0F 09 | Same as legacy mode. | Not relevant. | No GPR register results. | No GPR register results. |
| WRMSR (Write to Model-Specific Register) | 0F 30 | Same as legacy mode. | Not relevant. | No GPR register results. MSR[63:32] = RDX[31:0]; MSR[31:0] = RAX[31:0]. | No GPR register results. MSR[63:32] = RDX[31:0]; MSR[31:0] = RAX[31:0]. |
| XADD (Exchange and Add) | 0F C1 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| XCHG (Exchange Register/Memory with Register) | 87, 90 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |
| XOR (Logical Exclusive OR) | 31, 33, 35, 81 /6, 83 /6 | Promoted to 64 bits. | 32 bits | Zero-extends 32-bit register results to 64 bits. | |


B.3 Invalid and Reassigned Instructions in 64-Bit Mode

Table B-2 lists instructions that are illegal in 64-bit mode. Attempted use of these instructions 
generates an invalid-opcode exception (#UD). 


Table B-2. Invalid Instructions in 64-Bit Mode

| Mnemonic | Opcode (hex) | Description |
|---|---|---|
| AAA | 37 | ASCII Adjust After Addition |
| AAD | D5 | ASCII Adjust Before Division |
| AAM | D4 | ASCII Adjust After Multiply |
| AAS | 3F | ASCII Adjust After Subtraction |
| BOUND | 62 | Check Array Bounds |
| CALL (far) | 9A | Procedure Call Far (far absolute) |
| DAA | 27 | Decimal Adjust after Addition |
| DAS | 2F | Decimal Adjust after Subtraction |
| INTO | CE | Interrupt to Overflow Vector |
| JMP (far) | EA | Jump Far (absolute) |
| LDS | C5 | Load DS Far Pointer |
| LES | C4 | Load ES Far Pointer |
| POP DS | 1F | Pop Stack into DS Segment |
| POP ES | 07 | Pop Stack into ES Segment |
| POP SS | 17 | Pop Stack into SS Segment |
| POPA, POPAD | 61 | Pop All to GPR Words or Doublewords |
| PUSH CS | 0E | Push CS Segment Selector onto Stack |
| PUSH DS | 1E | Push DS Segment Selector onto Stack |
| PUSH ES | 06 | Push ES Segment Selector onto Stack |
| PUSH SS | 16 | Push SS Segment Selector onto Stack |
| PUSHA, PUSHAD | 60 | Push All to GPR Words or Doublewords |
| Redundant Grp1 | 82 /2 | Redundant encoding of group1 Eb,Ib opcodes |
| SALC | D6 | Set AL According to CF |
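
A disassembler or emulator might hold the entries of Table B-2 in a lookup keyed by the first opcode byte. The following is a minimal sketch under that assumption: the dictionary and function names are hypothetical, the ModRM qualifier of the 82 /2 entry is not modeled, and the C4 and C5 bytes appear here only because Table B-2 lists them.

```python
# First opcode byte -> mnemonic, transcribed from Table B-2.
INVALID_IN_64BIT_MODE = {
    0x37: "AAA", 0xD5: "AAD", 0xD4: "AAM", 0x3F: "AAS",
    0x62: "BOUND", 0x9A: "CALL (far)", 0x27: "DAA", 0x2F: "DAS",
    0xCE: "INTO", 0xEA: "JMP (far)", 0xC5: "LDS", 0xC4: "LES",
    0x1F: "POP DS", 0x07: "POP ES", 0x17: "POP SS", 0x61: "POPA, POPAD",
    0x0E: "PUSH CS", 0x1E: "PUSH DS", 0x06: "PUSH ES", 0x16: "PUSH SS",
    0x60: "PUSHA, PUSHAD",
    0x82: "Redundant Grp1",  # listed as 82 /2; the ModRM byte is not checked here
    0xD6: "SALC",
}


def invalid_mnemonic(opcode: int):
    """Return the Table B-2 mnemonic for a first opcode byte, or None if not listed."""
    return INVALID_IN_64BIT_MODE.get(opcode)
```

A decoder using such a table would raise its model of the invalid-opcode exception (#UD) whenever `invalid_mnemonic` returns a name while decoding in 64-bit mode.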
Table B-3 lists instructions that are reassigned to different functions in 64-bit mode. Executing these
opcodes in 64-bit mode performs the reassigned function.


Table B-3. Reassigned Instructions in 64-Bit Mode

| Mnemonic | Opcode (hex) | Description |
|---|---|---|
| ARPL | 63 | Opcode for MOVSXD instruction in 64-bit mode. In all other modes, this is the Adjust Requestor Privilege Level instruction opcode. |
| DEC and INC | 40-4F | REX prefixes in 64-bit mode. In all other modes, decrement by 1 and increment by 1. |
| LDS | C5 | VEX Prefix. Introduces the VEX two-byte instruction encoding escape sequence. |
| LES | C4 | VEX Prefix. Introduces the VEX three-byte instruction encoding escape sequence. |
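
Because bytes 40h through 4Fh are REX prefixes in 64-bit mode, a decoder treats them as prefix bits rather than as INC or DEC. The sketch below extracts the four REX fields; the 0100WRXB bit layout comes from the REX prefix description elsewhere in the manual rather than from Table B-3, and the function name is hypothetical.

```python
def decode_rex(byte: int):
    """Return the REX fields of a 64-bit-mode prefix byte, or None if it is not 40h-4Fh."""
    if not 0x40 <= byte <= 0x4F:
        return None
    return {
        "W": (byte >> 3) & 1,  # 1 selects 64-bit operand size
        "R": (byte >> 2) & 1,  # extends the ModRM reg field
        "X": (byte >> 1) & 1,  # extends the SIB index field
        "B": byte & 1,         # extends the ModRM r/m, SIB base, or opcode reg field
    }
```

For instance, the byte 48h decodes to W=1 with the other fields clear, which is why it so often precedes instructions that operate on 64-bit operands.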


Table B-4 lists instructions that are illegal in long mode. Attempted use of these instructions generates 
an invalid-opcode exception (#UD). 


Table B-4. Invalid Instructions in Long Mode

| Mnemonic | Opcode (hex) | Description |
|---|---|---|
| SYSENTER | 0F 34 | System Call |
| SYSEXIT | 0F 35 | System Return |


B.4 Instructions with 64-Bit Default Operand Size 

In 64-bit mode, two groups of instructions default to 64-bit operand size without the need for a REX 
prefix: 

• Near branches: CALL, Jcc, JrCX, JMP, LOOP, and RET.

• All instructions, except far branches, that implicitly reference the RSP: CALL, ENTER, LEAVE,
POP, PUSH, and RET (CALL and RET are in both groups of instructions).

Table B-5 lists these instructions.

Table B-5. Instructions Defaulting to 64-Bit Operand Size

Mnemonic | Opcode (hex) | Implicitly Reference RSP | Description
CALL | E8, FF /2 | yes | Call Procedure Near
ENTER | C8 | yes | Create Procedure Stack Frame
Jcc | many | no | Jump Conditional Near
JMP | E9, EB, FF /4 | no | Jump Near
LEAVE | C9 | yes | Delete Procedure Stack Frame
LOOP | E2 | no | Loop
LOOPcc | E0, E1 | no | Loop Conditional
POP reg/mem | 8F/0 | yes | Pop Stack (register or memory)
POP reg | 58-5F | yes | Pop Stack (register)
POP FS | 0F A1 | yes | Pop Stack into FS Segment Register
POP GS | 0F A9 | yes | Pop Stack into GS Segment Register
POPF, POPFD, POPFQ | 9D | yes | Pop to rFLAGS Word, Doubleword, or Quadword
PUSH imm8 | 6A | yes | Push onto Stack (sign-extended byte)
PUSH imm32 | 68 | yes | Push onto Stack (sign-extended doubleword)
PUSH reg/mem | FF/6 | yes | Push onto Stack (register or memory)
PUSH reg | 50-57 | yes | Push onto Stack (register)
PUSH FS | 0F A0 | yes | Push FS Segment Register onto Stack
PUSH GS | 0F A8 | yes | Push GS Segment Register onto Stack
PUSHF, PUSHFD, PUSHFQ | 9C | yes | Push rFLAGS Word, Doubleword, or Quadword onto Stack
RET | C2, C3 | yes | Return From Call (near)


The 64-bit default operand size can be overridden to 16 bits using the 66h operand-size override. 
However, it is not possible to override the operand size to 32 bits because there is no 32-bit operand- 
size override prefix for 64-bit mode. See “Operand-Size Override Prefix” on page 7 for details. 
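
The rule above can be sketched as a small C function. This is only an illustration: the function name and parameters are hypothetical, and the precedence of REX.W over the 66h prefix is an assumption taken from the general REX rules rather than from this section.

```c
#include <stdbool.h>

/* Effective operand size, in bits, for an instruction executed in 64-bit mode.
 * defaults_to_64: true for the instructions listed in Table B-5.
 * rex_w:          a REX prefix with the W bit set is present.
 * prefix_66h:     the 66h operand-size override prefix is present.
 * Assumption (not stated in this section): REX.W takes precedence over 66h. */
static int operand_size_bits(bool defaults_to_64, bool rex_w, bool prefix_66h)
{
    if (rex_w)
        return 64;
    if (prefix_66h)
        return 16;
    /* No prefix: Table B-5 instructions default to 64 bits, all others to 32. */
    return defaults_to_64 ? 64 : 32;
}
```

Note that no combination of inputs yields 32 bits when `defaults_to_64` is true, which mirrors the statement that a 32-bit operand size cannot be selected for these instructions.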


B.5 Single-Byte INC and DEC Instructions in 64-Bit Mode 

In 64-bit mode, the legacy encodings for the 16 single-byte INC and DEC instructions (one for each of 
the eight GPRs) are used to encode the REX prefix values, as described in “REX Prefix” on page 14. 
Therefore, these single-byte opcodes for INC and DEC are not available in 64-bit mode, although they 
are available in legacy and compatibility modes. The functionality of these INC and DEC instructions 
is still available in 64-bit mode, however, using the ModRM forms of those instructions (opcodes FF/0 
and FF/1). 
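
As a rough sketch of how a decoder might treat the bytes 40h through 4Fh, the following C fragment classifies a byte according to the operating mode. The type and function names are hypothetical; the REX bit layout (0100WRXB) is the one described in the REX prefix section.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { BYTE_OTHER, BYTE_INC_REG, BYTE_DEC_REG, BYTE_REX_PREFIX } byte_meaning;

/* Classify a byte in the range 40h-4Fh according to the operating mode. */
static byte_meaning classify_4x_byte(uint8_t byte, bool in_64bit_mode)
{
    if ((byte & 0xF0u) != 0x40u)
        return BYTE_OTHER;
    if (in_64bit_mode)
        return BYTE_REX_PREFIX;              /* 40h-4Fh are REX prefixes */
    /* Legacy and compatibility modes: 40h-47h are INC, 48h-4Fh are DEC. */
    return (byte & 0x08u) ? BYTE_DEC_REG : BYTE_INC_REG;
}

/* Register number encoded in the low three bits of a legacy INC/DEC opcode. */
static unsigned legacy_inc_dec_reg(uint8_t byte)
{
    return byte & 0x07u;
}

/* W bit of a REX prefix (bit 3 of 0100WRXB). */
static bool rex_w_bit(uint8_t rex)
{
    return ((rex >> 3) & 1u) != 0u;
}
```

For instance, byte 48h is DEC of register 0 in legacy mode, but in 64-bit mode it is a REX prefix with W set.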



AMD64 Technology 


24594 — Rev. 3.27—September 2019 


B.6 NOP in 64-Bit Mode 

Programs written for the legacy x86 architecture commonly use opcode 90h (the XCHG EAX, EAX 
instruction) as a one-byte NOP. In 64-bit mode, the processor treats opcode 90h specially in order to 
preserve this legacy NOP use. Without special handling in 64-bit mode, the instruction would not be a 
true no-operation. Therefore, in 64-bit mode the processor treats XCHG EAX, EAX as a true NOP, 
regardless of operand size. 

This special handling does not apply to the two-byte ModRM form of the XCHG instruction. Unless a
64-bit operand size is specified using a REX prefix byte, using the two-byte form of XCHG to
exchange a register with itself will not result in a no-operation because the default operand size is 32
bits in 64-bit mode.
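
The difference can be modelled in C on a 64-bit value standing in for RAX. This is a simplified sketch with hypothetical function names; it assumes the general 64-bit mode rule that a 32-bit result is zero-extended into the full 64-bit register, which is why the two-byte form is not a true no-operation.

```c
#include <stdint.h>

/* Opcode 90h in 64-bit mode: treated as a true NOP, RAX is unchanged. */
static uint64_t rax_after_nop_90h(uint64_t rax)
{
    return rax;
}

/* Two-byte ModRM form of XCHG EAX, EAX without REX.W: modelled as a 32-bit
 * register write, which (by assumption) clears bits 63:32 of RAX. */
static uint64_t rax_after_xchg_eax_eax_modrm(uint64_t rax)
{
    return (uint64_t)(uint32_t)rax;
}
```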

B.7 Segment Override Prefixes in 64-Bit Mode 

In 64-bit mode, the CS, DS, ES, SS segment-override prefixes have no effect. These four prefixes are 
no longer treated as segment-override prefixes in the context of multiple-prefix rules. Instead, they are 
treated as null prefixes. 

The FS and GS segment-override prefixes are treated as true segment-override prefixes in 64-bit
mode. Use of the FS and GS prefixes causes their respective segment bases to be added to the effective
address calculation. See “FS and GS Registers in 64-Bit Mode” in Volume 2 for details.
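
A minimal sketch of this behaviour, assuming a flat model in which the ignored segments contribute a base of zero. The enum, function name, and parameters are hypothetical and serve only to illustrate the rule.

```c
#include <stdint.h>

typedef enum { SEG_NONE, SEG_CS, SEG_DS, SEG_ES, SEG_SS, SEG_FS, SEG_GS } seg_override;

/* Address produced in 64-bit mode for a given segment-override prefix.
 * CS, DS, ES, and SS overrides are null prefixes and add nothing;
 * FS and GS overrides add their segment base to the effective address. */
static uint64_t address_with_override(seg_override prefix, uint64_t effective_address,
                                      uint64_t fs_base, uint64_t gs_base)
{
    switch (prefix) {
    case SEG_FS:
        return effective_address + fs_base;
    case SEG_GS:
        return effective_address + gs_base;
    default:
        return effective_address;   /* no prefix, or a null prefix */
    }
}
```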
Appendix C Differences Between Long Mode and Legacy Mode


Table C-1 summarizes the major differences between 64-bit mode and legacy protected mode. The
third column indicates differences between 64-bit mode and legacy mode. The fourth column indicates
whether that difference also applies to compatibility mode.


Table C-1. Differences Between Long Mode and Legacy Mode

Type | Subject | 64-Bit Mode Difference | Applies To Compatibility Mode?
Application Programming | Addressing | RIP-relative addressing available | no
Application Programming | Data and Address Sizes | Default data size is 32 bits | no
Application Programming | Data and Address Sizes | REX Prefix toggles data size to 64 bits | no
Application Programming | Data and Address Sizes | Default address size is 64 bits | no
Application Programming | Data and Address Sizes | Address size prefix toggles address size to 32 bits | no
Application Programming | Instruction Differences | Various opcodes are invalid or changed in 64-bit mode (see Table B-2 on page 531 and Table B-3 on page 532) | no
Application Programming | Instruction Differences | Various opcodes are invalid in long mode (see Table B-4 on page 532) | yes
Application Programming | Instruction Differences | MOV reg,imm32 becomes MOV reg,imm64 (with REX operand size prefix) | no
Application Programming | Instruction Differences | REX is always enabled | no
Application Programming | Instruction Differences | Direct-offset forms of MOV to or from accumulator become 64-bit offsets | no
Application Programming | Instruction Differences | MOVD extended to MOV 64 bits between MMX registers and long GPRs (with REX operand-size prefix) | no
System Programming | x86 Modes | Real and virtual-8086 modes not supported | yes
System Programming | Task Switching | Task switching not supported | yes
System Programming | Addressing | 64-bit virtual addresses | yes
System Programming | Addressing | 4-level paging structures | yes
System Programming | Addressing | PAE must always be enabled | yes
System Programming | Segmentation | CS, DS, ES, SS segment bases are ignored | no
System Programming | Segmentation | CS, DS, ES, FS, GS, SS segment limits are ignored | no
System Programming | Segmentation | CS, DS, ES, SS Segment prefixes are ignored | no
System Programming | Exception and Interrupt Handling | All pushes are 8 bytes | yes
System Programming | Exception and Interrupt Handling | 16-bit interrupt and trap gates are illegal | yes
System Programming | Exception and Interrupt Handling | 32-bit interrupt and trap gates are redefined as 64-bit gates and are expanded to 16 bytes | yes
System Programming | Exception and Interrupt Handling | SS is set to null on stack switch | yes
System Programming | Exception and Interrupt Handling | SS:RSP is pushed unconditionally | yes
System Programming | Call Gates | All pushes are 8 bytes | yes
System Programming | Call Gates | 16-bit call gates are illegal | yes
System Programming | Call Gates | 32-bit call gate type is redefined as 64-bit call gate and is expanded to 16 bytes. | yes
System Programming | Call Gates | SS is set to null on stack switch | yes
System Programming | System-Descriptor Registers | GDT, IDT, LDT, TR base registers expanded to 64 bits | yes
System Programming | System-Descriptor Table Entries and Pseudo-descriptors | LGDT and LIDT use expanded 10-byte pseudo-descriptors. | no
System Programming | System-Descriptor Table Entries and Pseudo-descriptors | LLDT and LTR use expanded 16-byte table entries. | no


Appendix D Instruction Subsets and CPUID Feature Flags


This appendix provides information that can be used to determine if a specific instruction within the
AMD64 instruction-set architecture (ISA) is supported on a processor.

Originally the x86 ISA was composed of a set of instructions from the general-purpose and system 
instruction groups. This set forms the base of the AMD64 ISA. As the ISA expanded over time, new 
instructions were added. Each addition constituted either a single instruction or a set of instructions 
and each addition was assigned a specific processor feature flag. 

Although most current processor products support the entire ISA, support for each added instruction or 
instruction subset is optional and must be confirmed by testing the corresponding feature flag. The 
presence of a particular instruction or subset is indicated by the corresponding feature flag being set. A 
feature flag is a single bit value located at a specific bit position within the 32-bit value returned in a 
register as a result of executing the CPUID instruction. 
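
Testing a feature flag therefore reduces to a single-bit test on a 32-bit register value. The helper below is a minimal sketch with a hypothetical name; it takes the register value as a parameter instead of executing CPUID itself, so it makes no assumption about how that value was obtained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the given bit (0-31) is set in a register value
 * returned by CPUID. Bit positions outside 0-31 are reported as not set. */
static bool feature_flag_set(uint32_t reg_value, unsigned bit)
{
    return bit < 32u && ((reg_value >> bit) & 1u) != 0u;
}
```

For example, Table D-1 places the CLFSH flag at EDX[19] of the standard function, so a caller would pass the returned EDX value and bit number 19. On GCC or Clang targeting x86, the register values could be read with `__get_cpuid` from `<cpuid.h>`; that is a toolchain detail outside the scope of this manual.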

For more information on using the CPUID instruction, see the instruction reference page for CPUID
on page 160. For a comprehensive list of processor feature flags accessed using the CPUID
instruction, see Appendix E, “Obtaining Processor Information Via the CPUID Instruction” on
page 607.

D.1 Instruction Set Overview

The AMD64 ISA can be organized into five instruction groups: 

1. General-purpose instructions 

These instructions operate on the general-purpose registers (GP registers) and can be used at all
privilege levels. This group includes instructions to load and store the contents of a GP register to
and from memory, move values between the GP registers, and perform arithmetic and logical
operations on the contents of the registers.

2. System instructions 

These instructions provide the means to manipulate the processor operating mode, access 
processor resources, handle program and system errors, and manage system memory. Many of 
these instructions require privilege level 0 to execute. 

3. x87 instructions 

These instructions are available at all privilege levels and include legacy floating-point 
instructions that use the ST(0)-ST(7) stack registers (FPR0-FPR7 physical registers) and 
internally use extended precision (80-bit) binary floating-point representation and operations. 

4. 64-bit media instructions

These instructions are available at all privilege levels and perform vector operations on packed
integer and floating-point values held in the 64-bit MMX™ registers. The MMX register set
overlays the FPR0-FPR7 physical registers. This group is composed of the MMX and 3DNow!™
instruction subsets and was subsequently expanded by the MMX and 3DNow! extensions subsets.

5. SSE instructions 

The SSE instructions operate on packed integer and floating-point values held in the XMM / YMM 
registers. SSE includes the original Streaming SIMD Extensions, all the subsequent named SSE 
subsets, and the AVX, XOP, and AES instructions. 

Figure D-1 on page 539 represents the relationship between the five major instruction groups and the
named instruction subsets. Circles represent the instruction subsets. These include the base instruction 
set labeled “Base Instructions” in the diagram and the named subsets. The diagram omits individual 
optional instructions and some of the minor named instruction subsets. Dashed-line polygons 
represent the instruction groups. 

Note that the 128-bit and 256-bit media instructions are referred to collectively as the Streaming SIMD 
Extensions (SSE). This is also the name of the original SSE subset. In the diagram the original SSE 
subset is labeled “SSE1 Instructions.” Collectively the 64-bit media and the SSE instructions make up 
the single instruction / multiple data (SIMD) group (labeled “SIMD Instructions” in the diagram). 

The overlapping of the SSE and 64-bit media instruction subsets indicates that these subsets share 
some common mnemonics. However, these common mnemonics either have distinct opcodes for each 
subset or they take operands in both the MMX and XMM register sets. 

The horizontal axis of Figure D-1 shows how the subsets have evolved over time.

Figure D-1. AMD64 ISA Instruction Subsets

D.2 CPUID Feature Flags Related to Instruction Support

Only a subset of the CPUID feature flags provides information related to instruction support. 

The feature flags related to supported instruction subsets are accessed via the standard function
number 0000_0001h, the extended function number 8000_0001h, and the structured extended
function number 0000_0007h.

The following table lists all flags related to instruction support. Entries for each flag provide the 
instruction or instruction subset corresponding to the flag, the CPUID function that must be executed 
to access the flag, and the bit position of the flag in the return value. The feature flags listed are used in 
Table D-2 on page 543: 

Table D-1. Feature Flags for Instruction / Instruction Subset Support

Feature Flag | Instruction or Subset | CPUID Function (note 1) | Feature Flag Bit Position (note 2)
BASE | Base Instruction set | - | -
CLFSH | CLFLUSH | standard | EDX[19]
CMPXCHG8B | CMPXCHG8B | both | EDX[8]
CMPXCHG16B | CMPXCHG16B | standard | ECX[13]
CMOV | CMOVcc | both | EDX[15]
MSR | RDMSR / WRMSR | both | EDX[5]
TSC | RDTSC / RDTSCP | both | EDX[4]
RDTSCP | RDTSCP | extended | EDX[27]
SysCallSysRet | SYSCALL / SYSRET | extended | EDX[11]
SysEnterSysExit | SYSENTER / SYSEXIT | standard | EDX[11]
FPU | x87 | both | EDX[0]
x87 && CMOV | FCMOVcc (note 3) | both | EDX[0] && EDX[15]
MMX | MMX | both | EDX[23]
3DNow | 3DNow! | extended | EDX[31]
MmxExt | MMX Extensions | extended | EDX[22]
3DNowExt | 3DNow! Extensions | extended | EDX[30]
3DNowPrefetch | PREFETCH / PREFETCHW | extended | ECX[8]
SSE | SSE1 | standard | EDX[25]
SSE2 | SSE2 | standard | EDX[26]
SSE3 | SSE3 | standard | ECX[0]
SSSE3 | SSSE3 | standard | ECX[9]
SSE4A | SSE4A | extended | ECX[6]
SSE41 | SSE4.1 | standard | ECX[19]
SSE42 | SSE4.2 | standard | ECX[20]
LM | Long Mode | extended | EDX[29]
SVM | Secure Virtual Machine | extended | ECX[2]
AVX | AVX | standard | ECX[28]
AVX2 | AVX2 | 0000_0007_0 | EBX[5]
XOP | XOP | extended | ECX[11]
AES | AES | standard | ECX[25]
FMA | FMA | standard | ECX[12]
FMA4 | FMA4 | extended | ECX[16]
F16C | 16-bit floating-point conversion | standard | ECX[29]
RDRAND | RDRAND | standard | ECX[30]
ABM | LZCNT | extended | ECX[5]
BMI1 | Bit Manipulation, group 1 | 0000_0007_0 | EBX[3]
BMI2 | Bit Manipulation, group 2 | 0000_0007_0 | EBX[8]
POPCNT | POPCNT | standard | ECX[23]
TBM | Trailing bit manipulation | extended | ECX[21]
MOVBE | MOVBE | standard | ECX[22]
MONITOR | MONITOR / MWAIT | standard | ECX[3]
MONITORX | MONITORX / MWAITX | extended | ECX[29]
PCLMULQDQ | PCLMULQDQ | standard | ECX[1]
FXSR | FXSAVE / FXRSTOR | both | EDX[24]
SKINIT | SKINIT / STGI | extended | ECX[12]
LahfSahf | LAHF / SAHF | extended | ECX[0]
FSGSBASE | FS and GS base read and write | 0000_0007_0 | EBX[0]
SHA | SHA | 0000_0007_0 | EBX[29]
CLFLOPT | CLFLOPT | 0000_0007_0 | EBX[23]
SMAP | SMAP | 0000_0007_0 | EBX[20]
ADX | ADX | 0000_0007_0 | EBX[19]
RDSEED | RDSEED | 0000_0007_0 | EBX[18]
SME | SME | 8000_001F | EAX[0]
SEV | SEV | 8000_001F | EAX[1]
PageFlushMsr | PageFlushMsr | 8000_001F | EAX[2]
ES | ES | 8000_001F | EAX[3]
CLZERO | CLZERO | 8000_0008 | EBX[0]
Instruction Retired Counter | Instruction Retired Counter | 8000_0008 | EBX[1]
Error Pointer Zero/Restore | Error Pointer Zero/Restore | 8000_0008 | EBX[2]
XSAVEOPT | XSAVEOPT | 0000_000D_1 | EAX[0]
XSAVEC | XSAVEC | 0000_000D_1 | EAX[1]
XGETBV w/ ECX=1 | XGETBV w/ ECX=1 | 0000_000D_1 | EAX[2]
XSAVES/XRSTORS | XSAVES / XRSTORS | 0000_000D_1 | EAX[3]
XSAVE | XSAVE / XRSTOR (note 4) | standard | ECX[26]

Notes:

1. standard = Fn 0000_0001h; extended = Fn 8000_0001h; both means that both
standard and extended CPUID functions return the same feature flag in the same
bit position of the return value. For functions of the form xxxx_xxxx_x, the trailing
digit is the value required in ECX.

2. Register and bit position of the return value that corresponds to the feature flag.

3. FCMOVcc instruction is supported if x87 and CMOVcc instructions are both supported.

4. XSAVE (and related) instructions require separate enablement.
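
The entries of Table D-1 lend themselves to a table-driven check. The following C sketch encodes three rows (FPU, CMOV, and AVX2) and evaluates them against a snapshot of CPUID return registers. All type and function names are hypothetical, and the register snapshot is supplied by the caller, so the sketch does not depend on any particular way of executing CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint32_t eax, ebx, ecx, edx; } cpuid_result;
typedef enum { REG_EAX, REG_EBX, REG_ECX, REG_EDX } cpuid_reg;

typedef struct {
    const char *name;     /* feature flag name from Table D-1 */
    uint32_t    function; /* value loaded into EAX before CPUID */
    uint32_t    subleaf;  /* value loaded into ECX, where one is required */
    cpuid_reg   reg;      /* register holding the flag */
    unsigned    bit;      /* bit position within that register */
} feature_entry;

/* Three rows transcribed from Table D-1. */
static const feature_entry ENTRY_FPU  = { "FPU",  0x00000001u, 0u, REG_EDX, 0u  };
static const feature_entry ENTRY_CMOV = { "CMOV", 0x00000001u, 0u, REG_EDX, 15u };
static const feature_entry ENTRY_AVX2 = { "AVX2", 0x00000007u, 0u, REG_EBX, 5u  };

/* Tests one entry against the registers returned for that entry's function. */
static bool entry_is_set(const feature_entry *e, const cpuid_result *r)
{
    uint32_t value;
    switch (e->reg) {
    case REG_EAX: value = r->eax; break;
    case REG_EBX: value = r->ebx; break;
    case REG_ECX: value = r->ecx; break;
    default:      value = r->edx; break;
    }
    return ((value >> e->bit) & 1u) != 0u;
}

/* Note 3: FCMOVcc is supported only if both x87 (FPU) and CMOV are supported. */
static bool fcmovcc_supported(const cpuid_result *fn1)
{
    return entry_is_set(&ENTRY_FPU, fn1) && entry_is_set(&ENTRY_CMOV, fn1);
}
```

Keeping the function number and ECX value in each entry makes it explicit which CPUID invocation the caller must perform before testing the bit.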


D.3 Instruction List 

Table D-2 shows the minimum current privilege level (CPL) required to execute each instruction and
the feature flag or flags that indicate support for that instruction. Each flag is listed in the column
corresponding to the instruction group to which it belongs. Note that some instructions span groups.
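
One way to read a row of Table D-2 is as a pair of conditions: the feature flag must be set and the current privilege level must be sufficient. The sketch below is a simplification with hypothetical names; it ignores per-instruction exceptions such as the one in note 3 of the table and any other controls described on the individual instruction reference pages.

```c
#include <stdbool.h>

typedef struct {
    const char *mnemonic;
    unsigned    min_cpl;       /* CPL column of Table D-2: 3 = any level, 0 = most privileged only */
    const char *feature_flag;  /* flag name from Table D-2, or "Base" */
} instruction_row;

/* Numerically lower CPL values are more privileged, so code running at CPL 0
 * satisfies every row, while code at CPL 3 satisfies only rows listing 3. */
static bool may_execute(const instruction_row *row, unsigned current_cpl, bool flag_supported)
{
    return flag_supported && current_cpl <= row->min_cpl;
}
```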
Table D-2. Instruction Groups and CPUID Feature Flags

Instruction: Mnemonic | Description | CPL
Instruction Group and CPUID Feature Flag(s) (note 1): General-Purpose | SSE | 64-Bit Media | x87 | System

AAA 

ASCII Adjust After Addition 

3 

Base 





AAD 

ASCII Adjust Before 

Division 

3 

Base 





AAM 

ASCII Adjust After Multiply 

3 

Base 





AAS 

ASCII Adjust After 
Subtraction 

3 

Base 





ADC 

Add with Carry 

3 

Base 





ADD 

Signed or Unsigned Add 

3 

Base 





ADDPD 

Add Packed Double- 
Precision Floating-Point 

3 


SSE2 




ADDPS 

Add Packed Single- 
Precision Floating-Point 

3 


SSE 




ADDSD 

Add Scalar Double- 
Precision Floating-Point 

3 


SSE2 




ADDSS 

Add Scalar Single- 
Precision Floating-Point 

3 


SSE 




ADDSUBPD 

Add and Subtract Double- 

Precision 

3 


SSE3 




ADDSUBPS 

Add and Subtract Single- 
Precision 

3 


SSE3 




AESDEC 

AES Decryption Round 

3 


AES 




AESDECLAST 

AES Last Decryption 

Round 

3 


AES 




AESENC 

AES Encryption Round 

3 


AES 




AESENCLAST 

AES Last Encryption 

Round 

3 


AES 




AESIMC 

AES InvMixColumn 

Transformation 

3 


AES 




AESKEYGENASSIST 

AES Assist Round Key 
Generation 

3 


AES 




AND 

Logical AND 

3 

Base 





ANDN 

Logical And-Not 

3 

BMI1 





Notes:

1. Columns indicate the instruction groups. Entries indicate the CPUID feature flag(s) indicating support for that
instruction. “Base” indicates that no feature flag exists and all processors support this instruction.

2. Mnemonic is used for two different instructions. Assemblers can distinguish them by the number and type of
operands.

3. MONITOR/MWAIT can execute at privilege level >0 if enabled. See instruction description.

4. One or more features of the instruction (usually support for 256-bit operands) are only available if AVX2 is
supported.

ANDNPD 

AND NOT Packed Double- 
Precision Floating-Point 

3 


SSE2 




ANDNPS 

AND NOT Packed Single- 
Precision Floating-Point 

3 


SSE 




ANDPD 

AND Packed Double- 
Precision Floating-Point 

3 


SSE2 




ANDPS 

AND Packed Single- 
Precision Floating-Point 

3 


SSE 




ARPL 

Adjust Requestor Privilege 
Level 

3 





Base 

BEXTR 

(immediate form) 

Bit Field Extract 

3 

TBM 





BEXTR 
(register form) 

Bit Field Extract 

3 

BMI1 





BLCFILL 

Fill From Lowest Clear Bit 

3 

TBM 





BLCI 

Isolate Lowest Clear Bit 

3 

TBM 





BLSI 

Isolate Lowest Set Bit 

3 

BMI1 





BLCIC 

Isolate Lowest Clear Bit 
and Complement 

3 

TBM 





BLCMSK 

Mask From Lowest Clear 

Bit 

3 

TBM 





BLCS 

Set Lowest Clear Bit 

3 

TBM 





BLENDPD 

Blend Packed Double- 
Precision Floating-Point 

3 


SSE41 




BLENDPS 

Blend Packed Single- 
Precision Floating-Point 

3 


SSE41 




BLENDVPD 

Variable Blend Packed 
Double-Precision Floating- 
Point 

3 


SSE41 




BLENDVPS 

Variable Blend Packed 
Single-Precision Floating- 
Point 

3 


SSE41 




BLSMSK 

Mask From Lowest Set Bit 

3 

BMI1 





BLSFILL 

Fill From Lowest Set Bit 

3 

TBM 






BLSIC 

Isolate Lowest Set Bit and 
Complement 

3 

TBM 





BLSR 

Reset Lowest Set Bit 

3 

BMI1 





BOUND 

Check Array Bounds 

3 

Base 





BSF 

Bit Scan Forward 

3 

Base 





BSR 

Bit Scan Reverse 

3 

Base 





BSWAP 

Byte Swap 

3 

Base 





BT 

Bit Test 

3 

Base 





BTC 

Bit Test and Complement 

3 

Base 





BTR 

Bit Test and Reset 

3 

Base 





BTS 

Bit Test and Set 

3 

Base 





BZHI 

Zero High Bits 

3 

BMI2 





CALL 

Procedure Call 

3 

Base 





CBW 

Convert Byte to Word 

3 

Base 





CDQ

Convert Doubleword to Quadword

3 

Base 





CDQE

Convert Doubleword to Quadword

3 

LM 





CLC 

Clear Carry Flag 

3 

Base 





CLD 

Clear Direction Flag 

3 

Base 





CLFLUSH 

Cache Line Flush 

3 

CLFSH 





CLGI 

Clear Global Interrupt Flag 

0 





SVM 

CLI 

Clear Interrupt Flag 

3 





Base 

CLTS 

Clear Task-Switched Flag in CR0

0 





Base 

CMC 

Complement Carry Flag 

3 

Base 





CMOVcc 

Conditional Move 

3 

CMOV 





CMP 

Compare 

3 

Base 





CMPPD 

Compare Packed Double- 
Precision Floating-Point 

3 


SSE2 




CMPPS 

Compare Packed Single- 
Precision Floating-Point 

3 


SSE 




CMPS 

Compare Strings 

3 

Base 






CMPSB 

Compare Strings by Byte 

3 

Base 





CMPSD 

Compare Strings by 
Doubleword 

3 

Base (note 2)





CMPSD 

Compare Scalar Double- 
Precision Floating-Point 

3 


SSE2 (note 2)




CMPSQ 

Compare Strings by 
Quadword 

3 

LM 





CMPSS 

Compare Scalar Single- 
Precision Floating-Point 

3 


SSE 




CMPSW 

Compare Strings by Word 

3 

Base 





CMPXCHG 

Compare and Exchange 

3 

Base 





CMPXCHG8B 

Compare and Exchange 
Eight Bytes 

3 

CMPXCHG8B 





CMPXCHG16B 

Compare and Exchange 
Sixteen Bytes 

3 

CMPXCHG16B 





COMISD 

Compare Ordered Scalar 
Double-Precision Floating- 
Point 

3 


SSE2 




COMISS 

Compare Ordered Scalar 
Single-Precision Floating- 
Point 

3 


SSE 




CPUID 

Processor Identification 

3 

Base 





CQO 

Convert Quadword to 
Double Quadword 

3 

LM 





CRC32 

32-bit Cyclical Redundancy 
Check 

3 

SSE42 





CVTDQ2PD 

Convert Packed 

Doubleword Integers to 
Packed Double-Precision 
Floating-Point 

3 


SSE2 




CVTDQ2PS 

Convert Packed 

Doubleword Integers to 
Packed Single-Precision 
Floating-Point 

3 


SSE2 





CVTPD2DQ 

Convert Packed Double- 
Precision Floating-Point to 
Packed Doubleword 
Integers 

3 


SSE2 




CVTPD2PI 

Convert Packed Double- 
Precision Floating-Point to 
Packed Doubleword 
Integers 

3 


SSE2 

SSE2 



CVTPD2PS 

Convert Packed Double- 
Precision Floating-Point to 
Packed Single-Precision 
Floating-Point 

3 


SSE2 




CVTPI2PD 

Convert Packed 

Doubleword Integers to 
Packed Double-Precision 
Floating-Point 

3 


SSE2 

SSE2 



CVTPI2PS 

Convert Packed 

Doubleword Integers to 
Packed Single-Precision 
Floating-Point 

3 


SSE 

SSE 



CVTPS2DQ 

Convert Packed Single- 
Precision Floating-Point to 
Packed Doubleword 
Integers 

3 


SSE2 




CVTPS2PD 

Convert Packed Single- 
Precision Floating-Point to 
Packed Double-Precision 
Floating-Point 

3 


SSE2 




CVTPS2PI 

Convert Packed Single- 
Precision Floating-Point to 
Packed Doubleword 
Integers 

3 


SSE 

SSE 



CVTSD2SI 

Convert Scalar Double- 
Precision Floating-Point to 
Signed Doubleword or 
Quadword Integer 

3 


SSE2 





CVTSD2SS 

Convert Scalar Double- 
Precision Floating-Point to 
Scalar Single-Precision 
Floating-Point 

3 


SSE2 




CVTSI2SD 

Convert Signed 

Doubleword or Quadword 
Integer to Scalar Double- 
Precision Floating-Point 

3 


SSE2 




CVTSI2SS 

Convert Signed 

Doubleword or Quadword 
Integer to Scalar Single- 
Precision Floating-Point 

3 


SSE 




CVTSS2SD 

Convert Scalar Single- 
Precision Floating-Point to 
Scalar Double-Precision 
Floating-Point 

3 


SSE2 




CVTSS2SI 

Convert Scalar Single- 
Precision Floating-Point to 
Signed Doubleword or 
Quadword Integer 

3 


SSE 




CVTTPD2DQ 

Convert Packed Double- 
Precision Floating-Point to 
Packed Doubleword 
Integers, Truncated 

3 


SSE2 




CVTTPD2PI 

Convert Packed Double- 
Precision Floating-Point to 
Packed Doubleword 
Integers, Truncated 

3 


SSE2 

SSE2 



CVTTPS2DQ 

Convert Packed Single- 
Precision Floating-Point to 
Packed Doubleword 
Integers, Truncated 

3 


SSE2 




CVTTPS2PI 

Convert Packed Single- 
Precision Floating-Point to 
Packed Doubleword 
Integers, Truncated 

3 


SSE 

SSE 




CVTTSD2SI 

Convert Scalar Double- 
Precision Floating-Point to 
Signed Doubleword or 
Quadword Integer, 
Truncated 

3 


SSE2 




CVTTSS2SI 

Convert Scalar Single- 
Precision Floating-Point to 
Signed Doubleword or 
Quadword Integer, 
Truncated 

3 


SSE 




CWD 

Convert Word to 

Doubleword 

3 

Base 





CWDE 

Convert Word to 

Doubleword 

3 

Base 





DAA 

Decimal Adjust after 
Addition 

3 

Base 





DAS 

Decimal Adjust after 
Subtraction 

3 

Base 





DEC 

Decrement by 1 

3 

Base 





DIV 

Unsigned Divide 

3 

Base 





DIVPD 

Divide Packed Double- 
Precision Floating-Point 

3 


SSE2 




DIVPS 

Divide Packed Single- 
Precision Floating-Point 

3 


SSE 




DIVSD 

Divide Scalar Double- 
Precision Floating-Point 

3 


SSE2 




DIVSS 

Divide Scalar Single- 
Precision Floating-Point 

3 


SSE 




DPPD 

Dot Product Packed 
Double-Precision Floating- 
Point 

3 


SSE41 




DPPS 

Dot Product Packed 
Single-Precision Floating- 
Point 

3 


SSE41 




EMMS 

Enter/Exit Multimedia State 

3 



MMX 

MMX 



ENTER 

Create Procedure Stack 

Frame 

3 

Base 





EXTRACTPS 

Extract Packed Single- 
Precision Floating-Point 

3 


SSE41 




EXTRQ 

Extract Field From Register 

3 


SSE4A 




F2XM1 

Floating-Point Compute 2^x - 1

3 




X87 


FABS 

Floating-Point Absolute 
Value 

3 




X87 


FADD 

Floating-Point Add 

3 




X87 


FADDP

Floating-Point Add and 

Pop 

3 




X87 


FBLD 

Floating-Point Load Binary- 
Coded Decimal 

3 




X87 


FBSTP 

Floating-Point Store 
Binary-Coded Decimal 
Integer and Pop 

3 




X87 


FCHS 

Floating-Point Change 

Sign 

3 




X87 


FCLEX 

Floating-Point Clear Flags 

3 




X87 


FCMOVB 

Floating-Point Conditional 
Move If Below 

3 




X87 && 
CMOV 


FCMOVBE 

Floating-Point Conditional 
Move If Below or Equal 

3 




X87 && 
CMOV 


FCMOVE 

Floating-Point Conditional 
Move If Equal 

3 




X87 && 
CMOV 


FCMOVNB 

Floating-Point Conditional 
Move If Not Below 

3 




X87 && 
CMOV 


FCMOVNBE 

Floating-Point Conditional 
Move If Not Below or Equal 

3 




X87 && 
CMOV 


FCMOVNE 

Floating-Point Conditional 
Move If Not Equal 

3 




X87 && 
CMOV 


FCMOVNU 

Floating-Point Conditional 
Move If Not Unordered 

3 




X87 && 
CMOV 



| FCMOVU | Floating-Point Conditional Move If Unordered | 3 | | | | X87 and CMOV | |
| FCOM | Floating-Point Compare | 3 | | | | X87 | |
| FCOMI | Floating-Point Compare and Set Flags | 3 | | | | X87 | |
| FCOMIP | Floating-Point Compare and Set Flags and Pop | 3 | | | | X87 | |
| FCOMP | Floating-Point Compare and Pop | 3 | | | | X87 | |
| FCOMPP | Floating-Point Compare and Pop Twice | 3 | | | | X87 | |
| FCOS | Floating-Point Cosine | 3 | | | | X87 | |
| FDECSTP | Floating-Point Decrement Stack-Top Pointer | 3 | | | | X87 | |
| FDIV | Floating-Point Divide | 3 | | | | X87 | |
| FDIVP | Floating-Point Divide and Pop | 3 | | | | X87 | |
| FDIVR | Floating-Point Divide Reverse | 3 | | | | X87 | |
| FDIVRP | Floating-Point Divide Reverse and Pop | 3 | | | | X87 | |
| FEMMS | Fast Enter/Exit Multimedia State | 3 | | | 3DNow | 3DNow | |
| FFREE | Free Floating-Point Register | 3 | | | | X87 | |
| FIADD | Floating-Point Add Integer to Stack Top | 3 | | | | X87 | |
| FICOM | Floating-Point Integer Compare | 3 | | | | X87 | |
| FICOMP | Floating-Point Integer Compare and Pop | 3 | | | | X87 | |
| FIDIV | Floating-Point Integer Divide | 3 | | | | X87 | |
| FIDIVR | Floating-Point Integer Divide Reverse | 3 | | | | X87 | |



| FILD | Floating-Point Load Integer | 3 | | | | X87 | |
| FIMUL | Floating-Point Integer Multiply | 3 | | | | X87 | |
| FINCSTP | Floating-Point Increment Stack-Top Pointer | 3 | | | | X87 | |
| FINIT | Floating-Point Initialize | 3 | | | | X87 | |
| FIST | Floating-Point Integer Store | 3 | | | | X87 | |
| FISTP | Floating-Point Integer Store and Pop | 3 | | | | X87 | |
| FISTTP | Floating-Point Integer Truncate and Store | 3 | | | | SSE3 | |
| FISUB | Floating-Point Integer Subtract | 3 | | | | X87 | |
| FISUBR | Floating-Point Integer Subtract Reverse | 3 | | | | X87 | |
| FLD | Floating-Point Load | 3 | | | | X87 | |
| FLD1 | Floating-Point Load +1.0 | 3 | | | | X87 | |
| FLDCW | Floating-Point Load x87 Control Word | 3 | | | | X87 | |
| FLDENV | Floating-Point Load x87 Environment | 3 | | | | X87 | |
| FLDL2E | Floating-Point Load Log2 e | 3 | | | | X87 | |
| FLDL2T | Floating-Point Load Log2 10 | 3 | | | | X87 | |
| FLDLG2 | Floating-Point Load Log10 2 | 3 | | | | X87 | |
| FLDLN2 | Floating-Point Load Ln 2 | 3 | | | | X87 | |
| FLDPI | Floating-Point Load Pi | 3 | | | | X87 | |
| FLDZ | Floating-Point Load +0.0 | 3 | | | | X87 | |
| FMUL | Floating-Point Multiply | 3 | | | | X87 | |
| FMULP | Floating-Point Multiply and Pop | 3 | | | | X87 | |



| FNCLEX | Floating-Point No-Wait Clear Flags | 3 | | | | X87 | |
| FNINIT | Floating-Point No-Wait Initialize | 3 | | | | X87 | |
| FNOP | Floating-Point No Operation | 3 | | | | X87 | |
| FNSAVE | Save No-Wait x87 and MMX State | 3 | | | X87 | X87 | |
| FNSTCW | Floating-Point No-Wait Store x87 Control Word | 3 | | | | X87 | |
| FNSTENV | Floating-Point No-Wait Store x87 Environment | 3 | | | | X87 | |
| FNSTSW | Floating-Point No-Wait Store x87 Status Word | 3 | | | | X87 | |
| FPATAN | Floating-Point Partial Arctangent | 3 | | | | X87 | |
| FPREM | Floating-Point Partial Remainder | 3 | | | | X87 | |
| FPREM1 | Floating-Point Partial Remainder | 3 | | | | X87 | |
| FPTAN | Floating-Point Partial Tangent | 3 | | | | X87 | |
| FRNDINT | Floating-Point Round to Integer | 3 | | | | X87 | |
| FRSTOR | Restore x87 and MMX State | 3 | | | X87 | X87 | |
| FSAVE | Save x87 and MMX State | 3 | | | X87 | X87 | |
| FSCALE | Floating-Point Scale | 3 | | | | X87 | |
| FSIN | Floating-Point Sine | 3 | | | | X87 | |
| FSINCOS | Floating-Point Sine and Cosine | 3 | | | | X87 | |
| FSQRT | Floating-Point Square Root | 3 | | | | X87 | |
| FST | Floating-Point Store Stack Top | 3 | | | | X87 | |



| FSTCW | Floating-Point Store x87 Control Word | 3 | | | | X87 | |
| FSTENV | Floating-Point Store x87 Environment | 3 | | | | X87 | |
| FSTP | Floating-Point Store Stack Top and Pop | 3 | | | | X87 | |
| FSTSW | Floating-Point Store x87 Status Word | 3 | | | | X87 | |
| FSUB | Floating-Point Subtract | 3 | | | | X87 | |
| FSUBP | Floating-Point Subtract and Pop | 3 | | | | X87 | |
| FSUBR | Floating-Point Subtract Reverse | 3 | | | | X87 | |
| FSUBRP | Floating-Point Subtract Reverse and Pop | 3 | | | | X87 | |
| FTST | Floating-Point Test with Zero | 3 | | | | X87 | |
| FUCOM | Floating-Point Unordered Compare | 3 | | | | X87 | |
| FUCOMI | Floating-Point Unordered Compare and Set Flags | 3 | | | | X87 | |
| FUCOMIP | Floating-Point Unordered Compare and Set Flags and Pop | 3 | | | | X87 | |
| FUCOMP | Floating-Point Unordered Compare and Pop | 3 | | | | X87 | |
| FUCOMPP | Floating-Point Unordered Compare and Pop Twice | 3 | | | | X87 | |
| FWAIT | Wait for x87 Floating-Point Exceptions | 3 | | | | X87 | |
| FXAM | Floating-Point Examine | 3 | | | | X87 | |
| FXCH | Floating-Point Exchange | 3 | | | | X87 | |
| FXRSTOR | Restore XMM, MMX, and x87 State | 3 | | FXSR | FXSR | FXSR | |



| FXSAVE | Save XMM, MMX, and x87 State | 3 | | FXSR | FXSR | FXSR | |
| FXTRACT | Floating-Point Extract Exponent and Significand | 3 | | | | X87 | |
| FYL2X | Floating-Point y * log2 x | 3 | | | | X87 | |
| FYL2XP1 | Floating-Point y * log2 (x + 1) | 3 | | | | X87 | |
| HADDPD | Horizontal Add Packed Double | 3 | | SSE3 | | | |
| HADDPS | Horizontal Add Packed Single | 3 | | SSE3 | | | |
| HLT | Halt | 0 | | | | | Base |
| HSUBPD | Horizontal Subtract Packed Double | 3 | | SSE3 | | | |
| HSUBPS | Horizontal Subtract Packed Single | 3 | | SSE3 | | | |
| IDIV | Signed Divide | 3 | Base | | | | |
| IMUL | Signed Multiply | 3 | Base | | | | |
| IN | Input from Port | 3 | Base | | | | |
| INC | Increment by 1 | 3 | Base | | | | |
| INS | Input String | 3 | Base | | | | |
| INSB | Input String Byte | 3 | Base | | | | |
| INSD | Input String Doubleword | 3 | Base | | | | |
| INSERTPS | Insert Packed Single-Precision Floating-Point | 3 | | SSE41 | | | |
| INSERTQ | Insert Field | 3 | | SSE4A | | | |
| INSW | Input String Word | 3 | Base | | | | |
| INT | Interrupt to Vector | 3 | Base | | | | |
| INT 3 | Interrupt to Debug Vector | 3 | | | | | Base |
| INTO | Interrupt to Overflow Vector | 3 | Base | | | | |
| INVD | Invalidate Caches | 0 | | | | | Base |
| INVLPG | Invalidate TLB Entry | 0 | | | | | Base |


| INVLPGA | Invalidate TLB Entry in a Specified ASID | 0 | | | | | SVM |
| IRET | Interrupt Return Word | 3 | | | | | Base |
| IRETD | Interrupt Return Doubleword | 3 | | | | | Base |
| IRETQ | Interrupt Return Quadword | 3 | | | | | LM |
| Jcc | Jump Condition | 3 | Base | | | | |
| JCXZ | Jump if CX Zero | 3 | Base | | | | |
| JECXZ | Jump if ECX Zero | 3 | Base | | | | |
| JMP | Jump | 3 | Base | | | | |
| JRCXZ | Jump if RCX Zero | 3 | Base | | | | |
| LAHF | Load Status Flags into AH Register | 3 | LahfSahf | | | | |
| LAR | Load Access Rights Byte | 3 | | | | | Base |
| LDDQU | Load Unaligned Double Quadword | 3 | | SSE3 | | | |
| LDMXCSR | Load MXCSR Control/Status Register | 3 | | SSE | | | |
| LDS | Load DS Far Pointer | 3 | Base | | | | |
| LEA | Load Effective Address | 3 | Base | | | | |
| LEAVE | Delete Procedure Stack Frame | 3 | Base | | | | |
| LES | Load ES Far Pointer | 3 | Base | | | | |
| LFENCE | Load Fence | 3 | SSE2 | | | | |
| LFS | Load FS Far Pointer | 3 | Base | | | | |
| LGDT | Load Global Descriptor Table Register | 0 | | | | | Base |
| LGS | Load GS Far Pointer | 3 | Base | | | | |
| LIDT | Load Interrupt Descriptor Table Register | 0 | | | | | Base |
| LLDT | Load Local Descriptor Table Register | 0 | | | | | Base |
| LMSW | Load Machine Status Word | 0 | | | | | Base |
| LODS | Load String | 3 | Base | | | | |






| LODSB | Load String Byte | 3 | Base | | | | |
| LODSD | Load String Doubleword | 3 | Base | | | | |
| LODSQ | Load String Quadword | 3 | LM | | | | |
| LODSW | Load String Word | 3 | Base | | | | |
| LOOP | Loop | 3 | Base | | | | |
| LOOPE | Loop if Equal | 3 | Base | | | | |
| LOOPNE | Loop if Not Equal | 3 | Base | | | | |
| LOOPNZ | Loop if Not Zero | 3 | Base | | | | |
| LOOPZ | Loop if Zero | 3 | Base | | | | |
| LSL | Load Segment Limit | 3 | Base | | | | |
| LSS | Load SS Segment Register | 3 | Base | | | | |
| LTR | Load Task Register | 0 | | | | | Base |
| LZCNT | Count Leading Zeros | 3 | ABM | | | | |
| MASKMOVDQU | Masked Move Double Quadword Unaligned | 3 | | SSE2 | | | |
| MASKMOVQ | Masked Move Quadword | 3 | | | SSE or MmxExt | | |
| MAXPD | Maximum Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MAXPS | Maximum Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MAXSD | Maximum Scalar Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MAXSS | Maximum Scalar Single-Precision Floating-Point | 3 | | SSE | | | |
| MFENCE | Memory Fence | 3 | SSE2 | | | | |
| MINPD | Minimum Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MINPS | Minimum Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MINSD | Minimum Scalar Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MINSS | Minimum Scalar Single-Precision Floating-Point | 3 | | SSE | | | |





| MONITOR | Setup Monitor Address (note 3) | 0 | | | | | MONITOR |
| MONITORX | Setup Monitor Address | 3 | MONITORX | | | | |
| MOV | Move | 3 | Base | | | | |
| MOV CRn | Move to/from Control Registers | 0 | | | | | Base |
| MOV DRn | Move to/from Debug Registers | 0 | | | | | Base |
| MOVAPD | Move Aligned Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MOVAPS | Move Aligned Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MOVBE | Move Big Endian | 3 | MOVBE | | | | |
| MOVD | Move Doubleword or Quadword | 3 | MMX, SSE2 | SSE2 | MMX | | |
| MOVDDUP | Move Double-Precision and Duplicate | 3 | | SSE3 | | | |
| MOVDQ2Q | Move Quadword to Quadword | 3 | | SSE2 | SSE2 | | |
| MOVDQA | Move Aligned Double Quadword | 3 | | SSE2 | | | |
| MOVDQU | Move Unaligned Double Quadword | 3 | | SSE2 | | | |
| MOVHLPS | Move Packed Single-Precision Floating-Point High to Low | 3 | | SSE | | | |
| MOVHPD | Move High Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MOVHPS | Move High Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MOVLHPS | Move Packed Single-Precision Floating-Point Low to High | 3 | | SSE | | | |





| MOVLPD | Move Low Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MOVLPS | Move Low Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MOVMSKPD | Extract Packed Double-Precision Floating-Point Sign Mask | 3 | SSE2 | SSE2 | | | |
| MOVMSKPS | Extract Packed Single-Precision Floating-Point Sign Mask | 3 | SSE | SSE | | | |
| MOVNTDQ | Move Non-Temporal Double Quadword | 3 | | SSE2 | | | |
| MOVNTDQA | Move Non-Temporal Double Quadword Aligned | 3 | | SSE41 | | | |
| MOVNTI | Move Non-Temporal Doubleword or Quadword | 3 | SSE2 | | | | |
| MOVNTPD | Move Non-Temporal Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MOVNTPS | Move Non-Temporal Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MOVNTSD | Move Non-Temporal Scalar Double-Precision Floating-Point | 3 | | SSE4A | | | |
| MOVNTSS | Move Non-Temporal Scalar Single-Precision Floating-Point | 3 | | SSE4A | | | |
| MOVNTQ | Move Non-Temporal Quadword | 3 | | | SSE or MmxExt | | |
| MOVQ | Move Quadword | 3 | | SSE2 | MMX | | |
| MOVQ2DQ | Move Quadword to Quadword | 3 | | SSE2 | SSE2 | | |
| MOVS | Move String | 3 | Base | | | | |
| MOVSB | Move String Byte | 3 | Base | | | | |






| MOVSD | Move String Doubleword | 3 | Base (note 2) | | | | |
| MOVSD | Move Scalar Double-Precision Floating-Point | 3 | | SSE2 (note 2) | | | |
| MOVSHDUP | Move Single-Precision High and Duplicate | 3 | | SSE3 | | | |
| MOVSLDUP | Move Single-Precision Low and Duplicate | 3 | | SSE3 | | | |
| MOVSQ | Move String Quadword | 3 | LM | | | | |
| MOVSS | Move Scalar Single-Precision Floating-Point | 3 | | SSE | | | |
| MOVSW | Move String Word | 3 | Base | | | | |
| MOVSX | Move with Sign-Extend | 3 | Base | | | | |
| MOVSXD | Move with Sign-Extend Doubleword | 3 | LM | | | | |
| MOVUPD | Move Unaligned Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MOVUPS | Move Unaligned Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MOVZX | Move with Zero-Extend | 3 | Base | | | | |
| MPSADBW | Multiple Sum of Absolute Differences | 3 | | SSE41 | | | |
| MUL | Multiply Unsigned | 3 | Base | | | | |
| MULPD | Multiply Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MULPS | Multiply Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| MULSD | Multiply Scalar Double-Precision Floating-Point | 3 | | SSE2 | | | |
| MULSS | Multiply Scalar Single-Precision Floating-Point | 3 | | SSE | | | |
| MULX | Multiply Unsigned | 3 | BMI2 | | | | |
| MWAIT | Monitor Wait (note 3) | 0 | | | | | MONITOR |


| MWAITX | Monitor Wait with Timeout | 3 | MONITORX | | | | |
| NEG | Two's Complement Negation | 3 | Base | | | | |
| NOP | No Operation | 3 | Base | | | | |
| NOT | One's Complement Negation | 3 | Base | | | | |
| OR | Logical OR | 3 | Base | | | | |
| ORPD | Logical Bitwise OR Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |
| ORPS | Logical Bitwise OR Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| OUT | Output to Port | 3 | Base | | | | |
| OUTS | Output String | 3 | Base | | | | |
| OUTSB | Output String Byte | 3 | Base | | | | |
| OUTSD | Output String Doubleword | 3 | Base | | | | |
| OUTSW | Output String Word | 3 | Base | | | | |
| PABSB | Packed Absolute Value Signed Byte | 3 | | SSSE3 | | | |
| PABSD | Packed Absolute Value Signed Doubleword | 3 | | SSSE3 | | | |
| PABSW | Packed Absolute Value Signed Word | 3 | | SSSE3 | | | |
| PACKSSDW | Pack with Saturation Signed Doubleword to Word | 3 | | SSE2 | MMX | | |
| PACKSSWB | Pack with Saturation Signed Word to Byte | 3 | | SSE2 | MMX | | |
| PACKUSDW | Pack with Unsigned Saturation Doubleword to Word | 3 | | SSE41 | | | |
| PACKUSWB | Pack with Saturation Signed Word to Unsigned Byte | 3 | | SSE2 | MMX | | |




| PADDB | Packed Add Bytes | 3 | | SSE2 | MMX | | |
| PADDD | Packed Add Doublewords | 3 | | SSE2 | MMX | | |
| PADDQ | Packed Add Quadwords | 3 | | SSE2 | SSE2 | | |
| PADDSB | Packed Add Signed with Saturation Bytes | 3 | | SSE2 | MMX | | |
| PADDSW | Packed Add Signed with Saturation Words | 3 | | SSE2 | MMX | | |
| PADDUSB | Packed Add Unsigned with Saturation Bytes | 3 | | SSE2 | MMX | | |
| PADDUSW | Packed Add Unsigned with Saturation Words | 3 | | SSE2 | MMX | | |
| PADDW | Packed Add Words | 3 | | SSE2 | MMX | | |
| PALIGNR | Packed Align Right | 3 | | SSSE3 | | | |
| PAND | Packed Logical Bitwise AND | 3 | | SSE2 | MMX | | |
| PANDN | Packed Logical Bitwise AND NOT | 3 | | SSE2 | MMX | | |
| PAUSE | Pause | 3 | Base | | | | |
| PAVGB | Packed Average Unsigned Bytes | 3 | | SSE2 | SSE or MmxExt | | |
| PAVGUSB | Packed Average Unsigned Bytes | 3 | | | 3DNow | | |
| PAVGW | Packed Average Unsigned Words | 3 | | SSE2 | SSE or MmxExt | | |
| PBLENDVB | Variable Blend Packed Bytes | 3 | | SSE41 | | | |
| PBLENDW | Blend Packed Words | 3 | | SSE41 | | | |
| PCLMULQDQ | Carry-less Multiply Quadwords | 3 | | CLMUL | | | |
| PCMPEQB | Packed Compare Equal Bytes | 3 | | SSE2 | MMX | | |
| PCMPEQD | Packed Compare Equal Doublewords | 3 | | SSE2 | MMX | | |




| PCMPEQQ | Packed Compare Equal Quadwords | 3 | | SSE41 | | | |
| PCMPEQW | Packed Compare Equal Words | 3 | | SSE2 | MMX | | |
| PCMPESTRI | Packed Compare Explicit Length Strings Return Index | 3 | | SSE42 | | | |
| PCMPESTRM | Packed Compare Explicit Length Strings Return Mask | 3 | | SSE42 | | | |
| PCMPGTB | Packed Compare Greater Than Signed Bytes | 3 | | SSE2 | MMX | | |
| PCMPGTD | Packed Compare Greater Than Signed Doublewords | 3 | | SSE2 | MMX | | |
| PCMPGTQ | Packed Compare Greater Than Signed Quadwords | 3 | | SSE42 | | | |
| PCMPGTW | Packed Compare Greater Than Signed Words | 3 | | SSE2 | MMX | | |
| PCMPISTRI | Packed Compare Implicit Length Strings Return Index | 3 | | SSE42 | | | |
| PCMPISTRM | Packed Compare Implicit Length Strings Return Mask | 3 | | SSE42 | | | |
| PDEP | Parallel Deposit Bits | 3 | BMI2 | | | | |
| PEXT | Parallel Extract Bits | 3 | BMI2 | | | | |
| PEXTRB | Extract Packed Byte | 3 | | SSE41 | | | |
| PEXTRD | Extract Packed Doubleword | 3 | | SSE41 | | | |
| PEXTRQ | Extract Packed Quadword | 3 | | SSE41 | | | |
| PEXTRW | Packed Extract Word | 3 | | SSE41 or SSE2 | SSE or MmxExt | | |
| PF2ID | Packed Floating-Point to Integer Doubleword Conversion | 3 | | | 3DNow | | |




| PF2IW | Packed Floating-Point to Integer Word Conversion | 3 | | | 3DNowExt | | |
| PFACC | Packed Floating-Point Accumulate | 3 | | | 3DNow | | |
| PFADD | Packed Floating-Point Add | 3 | | | 3DNow | | |
| PFCMPEQ | Packed Floating-Point Compare Equal | 3 | | | 3DNow | | |
| PFCMPGE | Packed Floating-Point Compare Greater or Equal | 3 | | | 3DNow | | |
| PFCMPGT | Packed Floating-Point Compare Greater Than | 3 | | | 3DNow | | |
| PFMAX | Packed Floating-Point Maximum | 3 | | | 3DNow | | |
| PFMIN | Packed Floating-Point Minimum | 3 | | | 3DNow | | |
| PFMUL | Packed Floating-Point Multiply | 3 | | | 3DNow | | |
| PFNACC | Packed Floating-Point Negative Accumulate | 3 | | | 3DNowExt | | |
| PFPNACC | Packed Floating-Point Positive-Negative Accumulate | 3 | | | 3DNowExt | | |
| PFRCP | Packed Floating-Point Reciprocal Approximation | 3 | | | 3DNow | | |
| PFRCPIT1 | Packed Floating-Point Reciprocal, Iteration 1 | 3 | | | 3DNow | | |
| PFRCPIT2 | Packed Floating-Point Reciprocal or Reciprocal Square Root, Iteration 2 | 3 | | | 3DNow | | |
| PFRSQIT1 | Packed Floating-Point Reciprocal Square Root, Iteration 1 | 3 | | | 3DNow | | |
| PFRSQRT | Packed Floating-Point Reciprocal Square Root Approximation | 3 | | | 3DNow | | |




| PFSUB | Packed Floating-Point Subtract | 3 | | | 3DNow | | |
| PFSUBR | Packed Floating-Point Subtract Reverse | 3 | | | 3DNow | | |
| PHADDD | Packed Horizontal Add Doubleword | 3 | | SSSE3 | | | |
| PHADDSW | Packed Horizontal Add with Saturation Word | 3 | | SSSE3 | | | |
| PHADDW | Packed Horizontal Add Word | 3 | | SSSE3 | | | |
| PHMINPOSUW | Horizontal Minimum and Position | 3 | | SSE41 | | | |
| PHSUBD | Packed Horizontal Subtract Doubleword | 3 | | SSSE3 | | | |
| PHSUBSW | Packed Horizontal Subtract with Saturation Word | 3 | | SSSE3 | | | |
| PHSUBW | Packed Horizontal Subtract Word | 3 | | SSSE3 | | | |
| PI2FD | Packed Integer to Floating-Point Doubleword Conversion | 3 | | | 3DNow | | |
| PI2FW | Packed Integer to Floating-Point Word Conversion | 3 | | | 3DNowExt | | |
| PINSRB | Packed Insert Byte | 3 | | SSE41 | | | |
| PINSRD | Packed Insert Doubleword | 3 | | SSE41 | | | |
| PINSRQ | Packed Insert Quadword | 3 | | SSE41 | | | |
| PINSRW | Packed Insert Word | 3 | | SSE2 | SSE or MmxExt | | |
| PMADDUBSW | Packed Multiply and Add Unsigned Byte to Signed Word | 3 | | SSSE3 | | | |
| PMADDWD | Packed Multiply Words and Add Doublewords | 3 | | SSE2 | MMX | | |




| PMAXSB | Packed Maximum Signed Bytes | 3 | | SSE41 | | | |
| PMAXSD | Packed Maximum Signed Doublewords | 3 | | SSE41 | | | |
| PMAXSW | Packed Maximum Signed Words | 3 | | SSE2 | SSE or MmxExt | | |
| PMAXUB | Packed Maximum Unsigned Bytes | 3 | | SSE2 | SSE or MmxExt | | |
| PMAXUD | Packed Maximum Unsigned Doublewords | 3 | | SSE41 | | | |
| PMAXUW | Packed Maximum Unsigned Words | 3 | | SSE41 | | | |
| PMINSB | Packed Minimum Signed Bytes | 3 | | SSE41 | | | |
| PMINSD | Packed Minimum Signed Doublewords | 3 | | SSE41 | | | |
| PMINSW | Packed Minimum Signed Words | 3 | | SSE2 | SSE or MmxExt | | |
| PMINUB | Packed Minimum Unsigned Bytes | 3 | | SSE2 | SSE or MmxExt | | |
| PMINUD | Packed Minimum Unsigned Doublewords | 3 | | SSE41 | | | |
| PMINUW | Packed Minimum Unsigned Words | 3 | | SSE41 | | | |
| PMOVMSKB | Packed Move Mask Byte | 3 | | SSE2 | SSE or MmxExt | | |
| PMOVSXBD | Packed Move with Sign-Extension Byte to Doubleword | 3 | | SSE41 | | | |
| PMOVSXBQ | Packed Move with Sign-Extension Byte to Quadword | 3 | | SSE41 | | | |
| PMOVSXBW | Packed Move with Sign-Extension Byte to Word | 3 | | SSE41 | | | |





PMOVSXDQ: Packed Move with Sign-Extension Doubleword to Quadword; CPL 3; SSE: SSE41
PMOVSXWD: Packed Move with Sign-Extension Word to Doubleword; CPL 3; SSE: SSE41
PMOVSXWQ: Packed Move with Sign-Extension Word to Quadword; CPL 3; SSE: SSE41
PMOVZXBD: Packed Move with Zero-Extension Byte to Doubleword; CPL 3; SSE: SSE41
PMOVZXBQ: Packed Move Byte to Quadword with Zero-Extension; CPL 3; SSE: SSE41
PMOVZXBW: Packed Move Byte to Word with Zero-Extension; CPL 3; SSE: SSE41
PMOVZXDQ: Packed Move with Zero-Extension Doubleword to Quadword; CPL 3; SSE: SSE41
PMOVZXWD: Packed Move Word to Doubleword with Zero-Extension; CPL 3; SSE: SSE41
PMOVZXWQ: Packed Move with Zero-Extension Word to Quadword; CPL 3; SSE: SSE41
PMULDQ: Packed Multiply Signed Doubleword to Quadword; CPL 3; SSE: SSE41
PMULHRSW: Packed Multiply High with Round and Scale Words; CPL 3; SSE: SSSE3
PMULHRW: Packed Multiply High Rounded Word; CPL 3; 64-Bit Media: 3DNow
PMULHUW: Packed Multiply High Unsigned Word; CPL 3; SSE: SSE2; 64-Bit Media: SSE || MmxExt
PMULHW: Packed Multiply High Signed Word; CPL 3; SSE: SSE2; 64-Bit Media: MMX




PMULLD: Packed Multiply and Store Low Signed Doubleword; CPL 3; SSE: SSE41
PMULLW: Packed Multiply Low Signed Word; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PMULUDQ: Packed Multiply Unsigned Doubleword and Store Quadword; CPL 3; SSE: SSE2; 64-Bit Media: SSE2
POP: Pop Stack; CPL 3; General-Purpose: Base
POPA: Pop All to GPR Words; CPL 3; General-Purpose: Base
POPAD: Pop All to GPR Doublewords; CPL 3; General-Purpose: Base
POPCNT: Bit Population Count; CPL 3; General-Purpose: Base
POPF: Pop to FLAGS Word; CPL 3; General-Purpose: Base
POPFD: Pop to EFLAGS Doubleword; CPL 3; General-Purpose: Base
POPFQ: Pop to RFLAGS Quadword; CPL 3; General-Purpose: LM
POR: Packed Logical Bitwise OR; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PREFETCH: Prefetch L1 Data-Cache Line; CPL 3; General-Purpose: 3DNow || 3DNowPrefetch || LM
PREFETCHlevel: Prefetch Data to Cache Level level; CPL 3; General-Purpose: SSE || MmxExt
PREFETCHW: Prefetch L1 Data-Cache Line for Write; CPL 3; General-Purpose: 3DNow || 3DNowPrefetch || LM
PSADBW: Packed Sum of Absolute Differences of Bytes into a Word; CPL 3; SSE: SSE2; 64-Bit Media: SSE || MmxExt
PSHUFB: Packed Shuffle Byte; CPL 3; SSE: SSSE3
PSHUFD: Packed Shuffle Doublewords; CPL 3; SSE: SSE2
PSHUFHW: Packed Shuffle High Words; CPL 3; SSE: SSE2
PSHUFLW: Packed Shuffle Low Words; CPL 3; SSE: SSE2




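
The PREFETCH and PREFETCHW entries list three alternatives joined by ||, all of which are reported by CPUID Fn8000_0001. A minimal sketch of that check follows, assuming the caller has already captured the ECX and EDX results of that function. The struct and function names are hypothetical, and the bit positions (3DNow = EDX[31], LM = EDX[29], 3DNowPrefetch = ECX[8]) are assumptions based on the commonly documented layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of CPUID Fn8000_0001 output. */
struct cpuid_ext1 {
    uint32_t ecx;
    uint32_t edx;
};

static inline bool ext_bit(uint32_t reg, unsigned n)
{
    return ((reg >> n) & 1u) != 0;
}

/* PREFETCH / PREFETCHW: "3DNow || 3DNowPrefetch || LM", any one flag suffices. */
static inline bool can_use_prefetchw(const struct cpuid_ext1 *e)
{
    bool three_d_now          = ext_bit(e->edx, 31); /* assumed bit position */
    bool three_d_now_prefetch = ext_bit(e->ecx, 8);  /* assumed bit position */
    bool long_mode            = ext_bit(e->edx, 29); /* assumed bit position */
    return three_d_now || three_d_now_prefetch || long_mode;
}

/* POPFQ: the entry "LM" only says the processor supports long mode;
   whether the code is currently running in 64-bit mode is a separate question. */
static inline bool cpu_reports_lm(const struct cpuid_ext1 *e)
{
    return ext_bit(e->edx, 29);
}
```

Evaluating the three flags separately keeps the mapping to the table entry visible, which may help when auditing the check against a later revision of the table.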

PSHUFW: Packed Shuffle Words; CPL 3; 64-Bit Media: SSE || MmxExt
PSIGNB: Packed Sign Byte; CPL 3; SSE: SSSE3
PSIGND: Packed Sign Doubleword; CPL 3; SSE: SSSE3
PSIGNW: Packed Sign Word; CPL 3; SSE: SSSE3
PSLLD: Packed Shift Left Logical Doublewords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSLLDQ: Packed Shift Left Logical Double Quadword; CPL 3; SSE: SSE2
PSLLQ: Packed Shift Left Logical Quadwords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSLLW: Packed Shift Left Logical Words; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSRAD: Packed Shift Right Arithmetic Doublewords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSRAW: Packed Shift Right Arithmetic Words; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSRLD: Packed Shift Right Logical Doublewords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSRLDQ: Packed Shift Right Logical Double Quadword; CPL 3; SSE: SSE2
PSRLQ: Packed Shift Right Logical Quadwords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSRLW: Packed Shift Right Logical Words; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSUBB: Packed Subtract Bytes; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSUBD: Packed Subtract Doublewords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSUBQ: Packed Subtract Quadword; CPL 3; SSE: SSE2; 64-Bit Media: SSE2
PSUBSB: Packed Subtract Signed With Saturation Bytes; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSUBSW: Packed Subtract Signed with Saturation Words; CPL 3; SSE: SSE2; 64-Bit Media: MMX




PSUBUSB: Packed Subtract Unsigned and Saturate Bytes; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSUBUSW: Packed Subtract Unsigned and Saturate Words; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSUBW: Packed Subtract Words; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PSWAPD: Packed Swap Doubleword; CPL 3; 64-Bit Media: 3DNowExt
PTEST: Packed Bit Test; CPL 3; SSE: SSE41
PUNPCKHBW: Unpack and Interleave High Bytes; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PUNPCKHDQ: Unpack and Interleave High Doublewords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PUNPCKHQDQ: Unpack and Interleave High Quadwords; CPL 3; SSE: SSE2
PUNPCKHWD: Unpack and Interleave High Words; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PUNPCKLBW: Unpack and Interleave Low Bytes; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PUNPCKLDQ: Unpack and Interleave Low Doublewords; CPL 3; SSE: SSE2; 64-Bit Media: MMX
PUNPCKLQDQ: Unpack and Interleave Low Quadwords; CPL 3; SSE: SSE2
PUNPCKLWD: Unpack and Interleave Low Words; CPL 3; SSE: SSE2; 64-Bit Media: 3DNow
PUSH: Push onto Stack; CPL 3; General-Purpose: Base
PUSHA: Push All GPR Words onto Stack; CPL 3; General-Purpose: Base
PUSHAD: Push All GPR Doublewords onto Stack; CPL 3; General-Purpose: Base
PUSHF: Push EFLAGS Word onto Stack; CPL 3; General-Purpose: Base
PUSHFD: Push EFLAGS Doubleword onto Stack; CPL 3; General-Purpose: Base
PUSHFQ: Push RFLAGS Quadword onto Stack; CPL 3; General-Purpose: LM






PXOR: Packed Logical Bitwise Exclusive OR; CPL 3; SSE: SSE2; 64-Bit Media: MMX
RCL: Rotate Through Carry Left; CPL 3; General-Purpose: Base
RCPPS: Reciprocal Packed Single-Precision Floating-Point; CPL 3; SSE: SSE
RCPSS: Reciprocal Scalar Single-Precision Floating-Point; CPL 3; SSE: SSE
RCR: Rotate Through Carry Right; CPL 3; General-Purpose: Base
RDFSBASE: Read FS.base; CPL 3; System: FSGSBASE
RDGSBASE: Read GS.base; CPL 3; System: FSGSBASE
RDMSR: Read Model-Specific Register; CPL 0; System: MSR
RDPMC: Read Performance-Monitoring Counter; CPL 3; System: Base
RDTSC: Read Time-Stamp Counter; CPL 3; System: TSC
RDTSCP: Read Time-Stamp Counter and Processor ID; CPL 3; System: TSC || RDTSCP
RET: Return from Call; CPL 3; General-Purpose: Base
ROL: Rotate Left; CPL 3; General-Purpose: Base
ROR: Rotate Right; CPL 3; General-Purpose: Base
RORX: Rotate Right Extended; CPL 3; General-Purpose: BMI2
ROUNDPD: Round Packed Double-Precision Floating-Point; CPL 3; SSE: SSE41
ROUNDPS: Round Packed Single-Precision Floating-Point; CPL 3; SSE: SSE41
ROUNDSD: Round Scalar Double-Precision Floating-Point; CPL 3; SSE: SSE41
ROUNDSS: Round Scalar Single-Precision Floating-Point; CPL 3; SSE: SSE41
RSM: Resume from System Management Mode; CPL 3; System: Base

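
Entries such as RORX (General-Purpose: BMI2, CPL 3) and RDMSR (System: MSR, CPL 0) combine a feature flag with a privilege requirement. The sketch below illustrates one way to model both columns. It assumes that the CPL column gives the least-privileged level at which the instruction can execute, that BMI2 and FSGSBASE are reported in CPUID Fn0000_0007 EBX (bits 8 and 0, with ECX = 0 on input), and that MSR is Fn0000_0001 EDX[5]. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the CPUID registers used below. */
struct cpuid_snapshot {
    uint32_t fn1_edx; /* CPUID Fn0000_0001 EDX */
    uint32_t fn7_ebx; /* CPUID Fn0000_0007 EBX, queried with ECX = 0 */
};

static inline bool reg_bit(uint32_t reg, unsigned n)
{
    return ((reg >> n) & 1u) != 0;
}

/* RORX: feature flag BMI2, CPL column 3 (usable at any privilege level). */
static inline bool may_execute_rorx(const struct cpuid_snapshot *s)
{
    return reg_bit(s->fn7_ebx, 8);
}

/* RDMSR: feature flag MSR, CPL column 0 (privileged code only). */
static inline bool may_execute_rdmsr(const struct cpuid_snapshot *s, unsigned cpl)
{
    return reg_bit(s->fn1_edx, 5) && cpl == 0;
}

/* RDFSBASE / RDGSBASE: feature flag FSGSBASE. System software must also
   enable the instructions (CR4.FSGSBASE), which this check does not cover. */
static inline bool cpu_reports_fsgsbase(const struct cpuid_snapshot *s)
{
    return reg_bit(s->fn7_ebx, 0);
}
```

The `cpl` parameter is supplied by the caller here because user-mode code cannot portably query it; a kernel would typically know its own privilege level statically.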

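RSQRTPS: Reciprocal Square Root Packed Single-Precision Floating-Point; CPL 3; SSE: SSE
RSQRTSS: Reciprocal Square Root Scalar Single-Precision Floating-Point; CPL 3; SSE: SSE
SAHF: Store AH into Flags; CPL 3; General-Purpose: LahfSahf
SAL: Shift Arithmetic Left; CPL 3; General-Purpose: Base
SAR: Shift Arithmetic Right; CPL 3; General-Purpose: Base
SARX: Shift Right Arithmetic Extended; CPL 3; General-Purpose: BMI2
SBB: Subtract with Borrow; CPL 3; General-Purpose: Base
SCAS: Scan String; CPL 3; General-Purpose: Base
SCASB: Scan String as Bytes; CPL 3; General-Purpose: Base
SCASD: Scan String as Doubleword; CPL 3; General-Purpose: Base
SCASQ: Scan String as Quadword; CPL 3; General-Purpose: LM
SCASW: Scan String as Words; CPL 3; General-Purpose: Base
SETcc: Set Byte if Condition; CPL 3; General-Purpose: Base
SFENCE: Store Fence; CPL 3; General-Purpose: SSE || MmxExt
SGDT: Store Global Descriptor Table Register; CPL 3; System: Base
SHL: Shift Left; CPL 3; General-Purpose: Base
SHLD: Shift Left Double; CPL 3; General-Purpose: Base
SHLX: Shift Left Logical Extended; CPL 3; General-Purpose: BMI2
SHR: Shift Right; CPL 3; General-Purpose: Base
SHRD: Shift Right Double; CPL 3; General-Purpose: Base
SHRX: Shift Right Logical Extended; CPL 3; General-Purpose: BMI2
SHUFPD: Shuffle Packed Double-Precision Floating-Point; CPL 3; SSE: SSE2
SHUFPS: Shuffle Packed Single-Precision Floating-Point; CPL 3; SSE: SSE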








Instruction Subsets and CPUID Feature Flags 




24594 — Rev. 3.28—September 2019 



AMD64 Technology 



SIDT: Store Interrupt Descriptor Table Register; CPL 3; System: Base
SKINIT: Secure Init and Jump with Attestation; CPL 0; System: SKINIT
SLDT: Store Local Descriptor Table Register; CPL 3; System: Base
SMSW: Store Machine Status Word; CPL 3; System: Base
SQRTPD: Square Root Packed Double-Precision Floating-Point; CPL 3; SSE: SSE2
SQRTPS: Square Root Packed Single-Precision Floating-Point; CPL 3; SSE: SSE
SQRTSD: Square Root Scalar Double-Precision Floating-Point; CPL 3; SSE: SSE2
SQRTSS: Square Root Scalar Single-Precision Floating-Point; CPL 3; SSE: SSE
STC: Set Carry Flag; CPL 3; General-Purpose: Base
STD: Set Direction Flag; CPL 3; General-Purpose: Base
STGI: Set Global Interrupt Flag; CPL 0; System: SKINIT
STI: Set Interrupt Flag; CPL 3; System: Base
STMXCSR: Store MXCSR Control/Status Register; CPL 3; SSE: SSE
STOS: Store String; CPL 3; General-Purpose: Base
STOSB: Store String Bytes; CPL 3; General-Purpose: Base
STOSD: Store String Doublewords; CPL 3; General-Purpose: Base
STOSQ: Store String Quadwords; CPL 3; General-Purpose: LM
STOSW: Store String Words; CPL 3; General-Purpose: Base
STR: Store Task Register; CPL 3; System: Base
SUB: Subtract; CPL 3; General-Purpose: Base
SUBPD: Subtract Packed Double-Precision Floating-Point; CPL 3; SSE: SSE2





SUBPS: Subtract Packed Single-Precision Floating-Point; CPL 3; SSE: SSE
SUBSD: Subtract Scalar Double-Precision Floating-Point; CPL 3; SSE: SSE2
SUBSS: Subtract Scalar Single-Precision Floating-Point; CPL 3; SSE: SSE
SWAPGS: Swap GS Register with KernelGSbase MSR; CPL 0; System: LM
SYSCALL: Fast System Call; CPL 3; System: SYSCALL, SYSRET
SYSENTER: System Call; CPL 3; System: SYSENTER, SYSEXIT
SYSEXIT: System Return; CPL 0; System: SYSENTER, SYSEXIT
SYSRET: Fast System Return; CPL 0; System: SYSCALL, SYSRET
T1MSKC: Inverse Mask From Trailing Ones; CPL 3; General-Purpose: TBM
TEST: Test Bits; CPL 3; General-Purpose: Base
TZCNT: Count Trailing Zeros; CPL 3; General-Purpose: BMI1
TZMSK: Mask From Trailing Zeros; CPL 3; General-Purpose: TBM
UCOMISD: Unordered Compare Scalar Double-Precision Floating-Point; CPL 3; SSE: SSE2
UCOMISS: Unordered Compare Scalar Single-Precision Floating-Point; CPL 3; SSE: SSE
UD2: Undefined Operation; CPL 3; System: Base
UNPCKHPD: Unpack High Double-Precision Floating-Point; CPL 3; SSE: SSE2
UNPCKHPS: Unpack High Single-Precision Floating-Point; CPL 3; SSE: SSE





UNPCKLPD: Unpack Low Double-Precision Floating-Point; CPL 3; SSE: SSE2
UNPCKLPS: Unpack Low Single-Precision Floating-Point; CPL 3; SSE: SSE
VADDPD: Add Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VADDPS: Add Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VADDSD: Add Scalar Double-Precision Floating-Point; CPL 3; SSE: AVX
VADDSS: Add Scalar Single-Precision Floating-Point; CPL 3; SSE: AVX
VADDSUBPD: Add and Subtract Double-Precision; CPL 3; SSE: AVX
VADDSUBPS: Add and Subtract Single-Precision; CPL 3; SSE: AVX
VAESDEC: AES Decryption Round; CPL 3; SSE: AVX
VAESDECLAST: AES Last Decryption Round; CPL 3; SSE: AVX
VAESENC: AES Encryption Round; CPL 3; SSE: AVX
VAESENCLAST: AES Last Encryption Round; CPL 3; SSE: AVX
VAESIMC: AES InvMixColumn Transformation; CPL 3; SSE: AVX
VAESKEYGENASSIST: AES Assist Round Key Generation; CPL 3; SSE: AVX
VANDNPD: AND NOT Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VANDNPS: AND NOT Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VANDPD: AND Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VANDPS: AND Packed Single-Precision Floating-Point; CPL 3; SSE: AVX





VBLENDPD: Blend Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VBLENDPS: Blend Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VBLENDVPD: Variable Blend Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VBLENDVPS: Variable Blend Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VBROADCASTF128: Load With Broadcast From 128-bit Memory Location; CPL 3; SSE: AVX
VBROADCASTI128: Load With Broadcast Integer From 128-bit Memory Location; CPL 3; SSE: AVX2
VBROADCASTSD: Load With Broadcast From 64-Bit Memory Location; CPL 3; SSE: AVX, AVX2 (note 4)
VBROADCASTSS: Load With Broadcast From 32-Bit Memory Location; CPL 3; SSE: AVX, AVX2 (note 4)
VCMPPD: Compare Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VCMPPS: Compare Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VCMPSD: Compare Scalar Double-Precision Floating-Point; CPL 3; SSE: AVX
VCMPSS: Compare Scalar Single-Precision Floating-Point; CPL 3; SSE: AVX
VCOMISD: Compare Ordered Scalar Double-Precision Floating-Point; CPL 3; SSE: AVX
VCOMISS: Compare Ordered Scalar Single-Precision Floating-Point; CPL 3; SSE: AVX




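
For entries marked "AVX, AVX2" with note 4, such as VBROADCASTSD and VBROADCASTSS, some forms of the instruction depend on AVX2 in addition to AVX. The sketch below classifies the reported support into three levels. It assumes AVX is Fn0000_0001 ECX[28] and AVX2 is Fn0000_0007 EBX[5]; the enum and function names are hypothetical. It deliberately leaves out the separate check that the operating system has enabled the extended register state, which real code would also need.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical three-level result for instructions marked with note 4. */
enum avx_level {
    AVX_LEVEL_NONE,     /* no form of the instruction is available      */
    AVX_LEVEL_AVX_ONLY, /* base forms only; AVX2-dependent forms absent */
    AVX_LEVEL_AVX2      /* all forms reported as available              */
};

/* fn1_ecx: CPUID Fn0000_0001 ECX; fn7_ebx: CPUID Fn0000_0007 EBX (ECX = 0). */
static inline enum avx_level classify_avx(uint32_t fn1_ecx, uint32_t fn7_ebx)
{
    bool avx  = ((fn1_ecx >> 28) & 1u) != 0; /* assumed bit position */
    bool avx2 = ((fn7_ebx >> 5) & 1u) != 0;  /* assumed bit position */

    if (!avx)
        return AVX_LEVEL_NONE;
    return avx2 ? AVX_LEVEL_AVX2 : AVX_LEVEL_AVX_ONLY;
}
```

Treating a set AVX2 bit without AVX as "none" is a conservative choice made for this sketch, not a behavior described in the table.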

VCVTDQ2PD: Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VCVTDQ2PS: Convert Packed Doubleword Integers to Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VCVTPD2DQ: Convert Packed Double-Precision Floating-Point to Packed Doubleword Integers; CPL 3; SSE: AVX
VCVTPD2PS: Convert Packed Double-Precision Floating-Point to Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VCVTPH2PS: Convert Packed 16-Bit Floating-Point to Single-Precision Floating-Point; CPL 3; SSE: AVX
VCVTPS2DQ: Convert Packed Single-Precision Floating-Point to Packed Doubleword Integers; CPL 3; SSE: AVX
VCVTPS2PD: Convert Packed Single-Precision Floating-Point to Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VCVTPS2PH: Convert Packed Single-Precision Floating-Point to 16-Bit Floating-Point; CPL 3; SSE: AVX
VCVTSD2SI: Convert Scalar Double-Precision Floating-Point to Signed Doubleword or Quadword Integer; CPL 3; SSE: AVX





VCVTSD2SS: Convert Scalar Double-Precision Floating-Point to Scalar Single-Precision Floating-Point; CPL 3; SSE: AVX
VCVTSI2SD: Convert Signed Doubleword or Quadword Integer to Scalar Double-Precision Floating-Point; CPL 3; SSE: AVX
VCVTSI2SS: Convert Signed Doubleword or Quadword Integer to Scalar Single-Precision Floating-Point; CPL 3; SSE: AVX
VCVTSS2SD: Convert Scalar Single-Precision Floating-Point to Scalar Double-Precision Floating-Point; CPL 3; SSE: AVX
VCVTSS2SI: Convert Scalar Single-Precision Floating-Point to Signed Doubleword or Quadword Integer; CPL 3; SSE: AVX
VCVTTPD2DQ: Convert Packed Double-Precision Floating-Point to Packed Doubleword Integers, Truncated; CPL 3; SSE: AVX
VCVTTPS2DQ: Convert Packed Single-Precision Floating-Point to Packed Doubleword Integers, Truncated; CPL 3; SSE: AVX
VCVTTSD2SI: Convert Scalar Double-Precision Floating-Point to Signed Doubleword or Quadword Integer, Truncated; CPL 3; SSE: AVX





VCVTTSS2SI: Convert Scalar Single-Precision Floating-Point to Signed Doubleword or Quadword Integer, Truncated; CPL 3; SSE: AVX
VDIVPD: Divide Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VDIVPS: Divide Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VDIVSD: Divide Scalar Double-Precision Floating-Point; CPL 3; SSE: AVX
VDIVSS: Divide Scalar Single-Precision Floating-Point; CPL 3; SSE: AVX
VDPPD: Dot Product Packed Double-Precision Floating-Point; CPL 3; SSE: AVX
VDPPS: Dot Product Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VERR: Verify Segment for Reads; CPL 3; System: Base
VERW: Verify Segment for Writes; CPL 3; System: Base
VEXTRACTF128: Extract Packed Values from 128-bit Memory Location; CPL 3; SSE: AVX
VEXTRACTI128: Extract 128-bit Integer; CPL 3; SSE: AVX2
VEXTRACTPS: Extract Packed Single-Precision Floating-Point; CPL 3; SSE: AVX
VFMADDPD: Multiply and Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA4
VFMADD132PD: Multiply and Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA




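
The fused multiply-add entries come in two families with different feature flags: mnemonics without an operand-order suffix (VFMADDPD, VFMADDPS, ...) list FMA4, while the 132/213/231 variants (VFMADD132PD, ...) list FMA. The sketch below shows one way a dispatcher might pick between them. It assumes FMA is Fn0000_0001 ECX[12] and FMA4 is Fn8000_0001 ECX[16]; the names and the preference for FMA when both are present are choices made for this illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result describing which instruction family to emit. */
enum fma_family {
    FMA_FAMILY_NONE, /* fall back to separate multiply and add         */
    FMA_FAMILY_FMA,  /* VFMADD132PD / VFMADD213PD / VFMADD231PD forms  */
    FMA_FAMILY_FMA4  /* VFMADDPD forms                                 */
};

/* fn1_ecx: CPUID Fn0000_0001 ECX; ext1_ecx: CPUID Fn8000_0001 ECX. */
static inline enum fma_family pick_fma_family(uint32_t fn1_ecx, uint32_t ext1_ecx)
{
    bool fma  = ((fn1_ecx >> 12) & 1u) != 0;  /* assumed bit position */
    bool fma4 = ((ext1_ecx >> 16) & 1u) != 0; /* assumed bit position */

    if (fma)
        return FMA_FAMILY_FMA; /* arbitrary preference when both are set */
    if (fma4)
        return FMA_FAMILY_FMA4;
    return FMA_FAMILY_NONE;
}
```

Because these are AVX-class instructions, a complete check would also confirm AVX and operating-system support for the extended register state; that part is omitted here.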

VFMADD213PD: Multiply and Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADD231PD: Multiply and Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDPS: Multiply and Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA4
VFMADD132PS: Multiply and Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMADD213PS: Multiply and Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMADD231PS: Multiply and Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDSD: Multiply and Add Scalar Double-Precision Floating-Point; CPL 3; SSE: FMA4
VFMADD132SD: Multiply and Add Scalar Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADD213SD: Multiply and Add Scalar Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADD231SD: Multiply and Add Scalar Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDSS: Multiply and Add Scalar Single-Precision Floating-Point; CPL 3; SSE: FMA4
VFMADD132SS: Multiply and Add Scalar Single-Precision Floating-Point; CPL 3; SSE: FMA





VFMADD213SS: Multiply and Add Scalar Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMADD231SS: Multiply and Add Scalar Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDSUBPD: Multiply with Alternating Add/Subtract Packed Double-Precision Floating-Point; CPL 3; SSE: FMA4
VFMADDSUB132PD: Multiply with Alternating Add/Subtract Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDSUB213PD: Multiply with Alternating Add/Subtract Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDSUB231PD: Multiply with Alternating Add/Subtract Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDSUBPS: Multiply with Alternating Add/Subtract Packed Single-Precision Floating-Point; CPL 3; SSE: FMA4
VFMADDSUB132PS: Multiply with Alternating Add/Subtract Packed Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMADDSUB213PS: Multiply with Alternating Add/Subtract Packed Single-Precision Floating-Point; CPL 3; SSE: FMA





VFMADDSUB231PS: Multiply with Alternating Add/Subtract Packed Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMSUBADDPD: Multiply with Alternating Subtract/Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA4
VFMSUBADD132PD: Multiply with Alternating Subtract/Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMSUBADD213PD: Multiply with Alternating Subtract/Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMSUBADD231PD: Multiply with Alternating Subtract/Add Packed Double-Precision Floating-Point; CPL 3; SSE: FMA
VFMSUBADDPS: Multiply with Alternating Subtract/Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA4
VFMSUBADD132PS: Multiply with Alternating Subtract/Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMSUBADD213PS: Multiply with Alternating Subtract/Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA
VFMSUBADD231PS: Multiply with Alternating Subtract/Add Packed Single-Precision Floating-Point; CPL 3; SSE: FMA





| VFMSUBPD | Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA4 | | | |
| VFMSUB132PD | Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB213PD | Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB231PD | Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUBPS | Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA4 | | | |
| VFMSUB132PS | Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB213PS | Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB231PS | Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUBSD | Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA4 | | | |
| VFMSUB132SD | Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB213SD | Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB231SD | Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA | | | |





| VFMSUBSS | Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA4 | | | |
| VFMSUB132SS | Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB213SS | Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFMSUB231SS | Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADDPD | Negative Multiply and Add Packed Double-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMADD132PD | Negative Multiply and Add Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD213PD | Negative Multiply and Add Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD231PD | Negative Multiply and Add Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADDPS | Negative Multiply and Add Packed Single-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMADD132PS | Negative Multiply and Add Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD213PS | Negative Multiply and Add Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD231PS | Negative Multiply and Add Packed Single-Precision Floating-Point | 3 | | FMA | | | |




Table D-2. Instruction Groups and CPUID Feature Flags

| VFNMADDSD | Negative Multiply and Add Scalar Double-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMADD132SD | Negative Multiply and Add Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD213SD | Negative Multiply and Add Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD231SD | Negative Multiply and Add Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADDSS | Negative Multiply and Add Scalar Single-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMADD132SS | Negative Multiply and Add Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD213SS | Negative Multiply and Add Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMADD231SS | Negative Multiply and Add Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUBPD | Negative Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMSUB132PD | Negative Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB213PD | Negative Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB231PD | Negative Multiply and Subtract Packed Double-Precision Floating-Point | 3 | | FMA | | | |





| VFNMSUBPS | Negative Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMSUB132PS | Negative Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB213PS | Negative Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB231PS | Negative Multiply and Subtract Packed Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUBSD | Negative Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMSUB132SD | Negative Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB213SD | Negative Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB231SD | Negative Multiply and Subtract Scalar Double-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUBSS | Negative Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA4 | | | |
| VFNMSUB132SS | Negative Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB213SS | Negative Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA | | | |
| VFNMSUB231SS | Negative Multiply and Subtract Scalar Single-Precision Floating-Point | 3 | | FMA | | | |





| VFRCZPD | Extract Fraction Packed Double-Precision Floating-Point | 3 | | XOP | | | |
| VFRCZPS | Extract Fraction Packed Single-Precision Floating-Point | 3 | | XOP | | | |
| VFRCZSD | Extract Fraction Scalar Double-Precision Floating-Point | 3 | | XOP | | | |
| VFRCZSS | Extract Fraction Scalar Single-Precision Floating-Point | 3 | | XOP | | | |
| VGATHERDPD | Conditionally Gather Double-Precision Floating-Point Values, Doubleword Indices | 3 | | AVX2 | | | |
| VGATHERDPS | Conditionally Gather Single-Precision Floating-Point Values, Doubleword Indices | 3 | | AVX2 | | | |
| VGATHERQPD | Conditionally Gather Double-Precision Floating-Point Values, Quadword Indices | 3 | | AVX2 | | | |
| VGATHERQPS | Conditionally Gather Single-Precision Floating-Point Values, Quadword Indices | 3 | | AVX2 | | | |
| VHADDPD | Horizontal Add Packed Double | 3 | | AVX | | | |
| VHADDPS | Horizontal Add Packed Single | 3 | | AVX | | | |
| VHSUBPD | Horizontal Subtract Packed Double | 3 | | AVX | | | |
| VHSUBPS | Horizontal Subtract Packed Single | 3 | | AVX | | | |





| VINSERTF128 | Insert Packed Values 128-bit | 3 | | AVX | | | |
| VINSERTI128 | Insert Packed Integer Values 128-bit | 3 | | AVX2 | | | |
| VINSERTPS | Insert Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VLDDQU | Load Unaligned Double Quadword | 3 | | AVX | | | |
| VLDMXCSR | Load MXCSR Control/Status Register | 3 | | AVX | | | |
| VMASKMOVDQU | Masked Move Double Quadword Unaligned | 3 | | AVX | | | |
| VMASKMOVPD | Masked Move Packed Double-Precision | 3 | | AVX | | | |
| VMASKMOVPS | Masked Move Packed Single-Precision | 3 | | AVX | | | |
| VMAXPD | Maximum Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VMAXPS | Maximum Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMAXSD | Maximum Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VMAXSS | Maximum Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VMINPD | Minimum Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VMINPS | Minimum Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMINSD | Minimum Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VMINSS | Minimum Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VMOVAPD | Move Aligned Packed Double-Precision Floating-Point | 3 | | AVX | | | |





| VMOVAPS | Move Aligned Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMOVD | Move Doubleword or Quadword | 3 | | AVX | | | |
| VMOVDDUP | Move Double-Precision and Duplicate | 3 | | AVX | | | |
| VMOVDQA | Move Aligned Double Quadword | 3 | | AVX | | | |
| VMOVDQU | Move Unaligned Double Quadword | 3 | | AVX | | | |
| VMOVHLPS | Move Packed Single-Precision Floating-Point High to Low | 3 | | AVX | | | |
| VMOVHPD | Move High Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VMOVHPS | Move High Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMOVLHPS | Move Packed Single-Precision Floating-Point Low to High | 3 | | AVX | | | |
| VMOVLPD | Move Low Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VMOVLPS | Move Low Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMOVMSKPD | Extract Packed Double-Precision Floating-Point Sign Mask | 3 | | AVX | | | |
| VMOVMSKPS | Extract Packed Single-Precision Floating-Point Sign Mask | 3 | | AVX | | | |
| VMOVNTDQ | Move Non-Temporal Double Quadword | 3 | | AVX | | | |





| VMOVNTDQA | Move Non-Temporal Double Quadword Aligned | 3 | | AVX, AVX2 (note 4) | | | |
| VMOVNTPD | Move Non-Temporal Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VMOVNTPS | Move Non-Temporal Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMOVQ | Move Quadword | 3 | | AVX | | | |
| VMOVSD | Move Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VMOVSHDUP | Move Single-Precision High and Duplicate | 3 | | AVX | | | |
| VMOVSLDUP | Move Single-Precision Low and Duplicate | 3 | | AVX | | | |
| VMOVSS | Move Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VMOVUPD | Move Unaligned Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VMOVUPS | Move Unaligned Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMPSADBW | Multiple Sum of Absolute Differences | 3 | | AVX, AVX2 (note 4) | | | |
| VMULPD | Multiply Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VMULPS | Multiply Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VMULSD | Multiply Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VMULSS | Multiply Scalar Single-Precision Floating-Point | 3 | | AVX | | | |





| VORPD | Logical Bitwise OR Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VORPS | Logical Bitwise OR Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VPABSB | Packed Absolute Value Signed Byte | 3 | | AVX, AVX2 (note 4) | | | |
| VPABSD | Packed Absolute Value Signed Doubleword | 3 | | AVX, AVX2 (note 4) | | | |
| VPABSW | Packed Absolute Value Signed Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPACKSSDW | Pack with Saturation Signed Doubleword to Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPACKSSWB | Pack with Saturation Signed Word to Byte | 3 | | AVX, AVX2 (note 4) | | | |
| VPACKUSDW | Pack with Unsigned Saturation Doubleword to Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPACKUSWB | Pack with Saturation Signed Word to Unsigned Byte | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDB | Packed Add Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDD | Packed Add Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDQ | Packed Add Quadwords | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDSB | Packed Add Signed with Saturation Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDSW | Packed Add Signed with Saturation Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDUSB | Packed Add Unsigned with Saturation Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDUSW | Packed Add Unsigned with Saturation Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPADDW | Packed Add Words | 3 | | AVX, AVX2 (note 4) | | | |





| VPALIGNR | Packed Align Right | 3 | | AVX, AVX2 (note 4) | | | |
| VPAND | Packed Logical Bitwise AND | 3 | | AVX, AVX2 (note 4) | | | |
| VPANDN | Packed Logical Bitwise AND NOT | 3 | | AVX, AVX2 (note 4) | | | |
| VPAVGB | Packed Average Unsigned Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPAVGW | Packed Average Unsigned Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPBLENDD | Blend Packed Doublewords | 3 | | AVX2 | | | |
| VPBLENDVB | Variable Blend Packed Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPBLENDW | Blend Packed Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPBROADCASTB | Broadcast Packed Byte | 3 | | AVX2 | | | |
| VPBROADCASTD | Broadcast Packed Doubleword | 3 | | AVX2 | | | |
| VPBROADCASTQ | Broadcast Packed Quadword | 3 | | AVX2 | | | |
| VPBROADCASTW | Broadcast Packed Word | 3 | | AVX2 | | | |
| VPCLMULQDQ | Carry-less Multiply Quadwords | 3 | | CLMUL || AVX | | | |
| VPCMOV | Vector Conditional Move | 3 | | XOP | | | |
| VPCMPEQB | Packed Compare Equal Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPEQD | Packed Compare Equal Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPEQQ | Packed Compare Equal Quadwords | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPEQW | Packed Compare Equal Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPESTRI | Packed Compare Explicit Length Strings Return Index | 3 | | AVX | | | |





| VPCMPESTRM | Packed Compare Explicit Length Strings Return Mask | 3 | | AVX | | | |
| VPCMPGTB | Packed Compare Greater Than Signed Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPGTD | Packed Compare Greater Than Signed Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPGTQ | Packed Compare Greater Than Signed Quadwords | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPGTW | Packed Compare Greater Than Signed Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPCMPISTRI | Packed Compare Implicit Length Strings Return Index | 3 | | AVX | | | |
| VPCMPISTRM | Packed Compare Implicit Length Strings Return Mask | 3 | | AVX | | | |
| VPCOMB | Compare Vector Signed Bytes | 3 | | XOP | | | |
| VPCOMD | Compare Vector Signed Doublewords | 3 | | XOP | | | |
| VPCOMQ | Compare Vector Signed Quadwords | 3 | | XOP | | | |
| VPCOMUB | Compare Vector Unsigned Bytes | 3 | | XOP | | | |
| VPCOMUD | Compare Vector Unsigned Doublewords | 3 | | XOP | | | |
| VPCOMUQ | Compare Vector Unsigned Quadwords | 3 | | XOP | | | |
| VPCOMUW | Compare Vector Unsigned Words | 3 | | XOP | | | |
| VPCOMW | Compare Vector Signed Words | 3 | | XOP | | | |
| VPERM2F128 | Permute Floating-Point 128-bit | 3 | | AVX | | | |





| VPERM2I128 | Permute Integer 128-bit | 3 | | AVX2 | | | |
| VPERMD | Packed Permute Doubleword | 3 | | AVX2 | | | |
| VPERMIL2PD | Permute Two-Source Double-Precision Floating-Point | 3 | | XOP | | | |
| VPERMIL2PS | Permute Two-Source Single-Precision Floating-Point | 3 | | XOP | | | |
| VPERMILPD | Permute Double-Precision | 3 | | AVX | | | |
| VPERMILPS | Permute Single-Precision | 3 | | AVX | | | |
| VPERMPD | Packed Permute Double-Precision Floating-Point | 3 | | AVX2 | | | |
| VPERMPS | Packed Permute Single-Precision Floating-Point | 3 | | AVX2 | | | |
| VPERMQ | Packed Permute Quadword | 3 | | AVX2 | | | |
| VPEXTRB | Extract Packed Byte | 3 | | AVX | | | |
| VPEXTRD | Extract Packed Doubleword | 3 | | AVX | | | |
| VPEXTRQ | Extract Packed Quadword | 3 | | AVX | | | |
| VPEXTRW | Packed Extract Word | 3 | | AVX | | | |
| VPGATHERDD | Conditionally Gather Doublewords, Doubleword Indices | 3 | | AVX2 | | | |
| VPGATHERDQ | Conditionally Gather Quadwords, Doubleword Indices | 3 | | AVX2 | | | |
| VPGATHERQD | Conditionally Gather Doublewords, Quadword Indices | 3 | | AVX2 | | | |
| VPGATHERQQ | Conditionally Gather Quadwords, Quadword Indices | 3 | | AVX2 | | | |





| VPHADDBD | Packed Horizontal Add Signed Byte to Signed Doubleword | 3 | | XOP | | | |
| VPHADDBQ | Packed Horizontal Add Signed Byte to Signed Quadword | 3 | | XOP | | | |
| VPHADDBW | Packed Horizontal Add Signed Byte to Signed Word | 3 | | XOP | | | |
| VPHADDD | Packed Horizontal Add Doubleword | 3 | | AVX, AVX2 (note 4) | | | |
| VPHADDDQ | Packed Horizontal Add Signed Doubleword to Signed Quadword | 3 | | XOP | | | |
| VPHADDSW | Packed Horizontal Add with Saturation Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPHADDUBD | Packed Horizontal Add Unsigned Byte to Doubleword | 3 | | XOP | | | |
| VPHADDUBQ | Packed Horizontal Add Unsigned Byte to Quadword | 3 | | XOP | | | |
| VPHADDUBW | Packed Horizontal Add Unsigned Byte to Word | 3 | | XOP | | | |
| VPHADDUDQ | Packed Horizontal Add Unsigned Doubleword to Quadword | 3 | | XOP | | | |
| VPHADDUWD | Packed Horizontal Add Unsigned Word to Doubleword | 3 | | XOP | | | |
| VPHADDUWQ | Packed Horizontal Add Unsigned Word to Quadword | 3 | | XOP | | | |
| VPHADDW | Packed Horizontal Add Word | 3 | | AVX, AVX2 (note 4) | | | |





| VPHADDWD | Packed Horizontal Add Signed Word to Signed Doubleword | 3 | | XOP | | | |
| VPHADDWQ | Packed Horizontal Add Signed Word to Signed Quadword | 3 | | XOP | | | |
| VPHMINPOSUW | Horizontal Minimum and Position | 3 | | AVX | | | |
| VPHSUBBW | Packed Horizontal Subtract Signed Byte to Signed Word | 3 | | XOP | | | |
| VPHSUBD | Packed Horizontal Subtract Doubleword | 3 | | AVX, AVX2 (note 4) | | | |
| VPHSUBDQ | Packed Horizontal Subtract Signed Doubleword to Signed Quadword | 3 | | XOP | | | |
| VPHSUBSW | Packed Horizontal Subtract with Saturation Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPHSUBW | Packed Horizontal Subtract Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPHSUBWD | Packed Horizontal Subtract Signed Word to Signed Doubleword | 3 | | XOP | | | |
| VPINSRB | Packed Insert Byte | 3 | | AVX | | | |
| VPINSRD | Packed Insert Doubleword | 3 | | AVX | | | |
| VPINSRQ | Packed Insert Quadword | 3 | | AVX | | | |
| VPINSRW | Packed Insert Word | 3 | | AVX | | | |
| VPMACSDD | Packed Multiply Accumulate Signed Doubleword to Signed Doubleword | 3 | | XOP | | | |
| VPMACSDQH | Packed Multiply Accumulate Signed High Doubleword to Signed Quadword | 3 | | XOP | | | |





| VPMACSDQL | Packed Multiply Accumulate Signed Low Doubleword to Signed Quadword | 3 | | XOP | | | |
| VPMACSSDD | Packed Multiply Accumulate with Saturation Signed Doubleword to Signed Doubleword | 3 | | XOP | | | |
| VPMACSSDQH | Packed Multiply Accumulate with Saturation Signed High Doubleword to Signed Quadword | 3 | | XOP | | | |
| VPMACSSDQL | Packed Multiply Accumulate with Saturation Signed Low Doubleword to Signed Quadword | 3 | | XOP | | | |
| VPMACSSWD | Packed Multiply Accumulate with Saturation Signed Word to Signed Doubleword | 3 | | XOP | | | |
| VPMACSSWW | Packed Multiply Accumulate with Saturation Signed Word to Signed Word | 3 | | XOP | | | |
| VPMACSWD | Packed Multiply Accumulate Signed Word to Signed Doubleword | 3 | | XOP | | | |
| VPMACSWW | Packed Multiply Accumulate Signed Word to Signed Word | 3 | | XOP | | | |
| VPMADCSSWD | Packed Multiply Add Accumulate with Saturation Signed Word to Signed Doubleword | 3 | | XOP | | | |





| VPMADCSWD | Packed Multiply Add Accumulate Signed Word to Signed Doubleword | 3 | | XOP | | | |
| VPMADDUBSW | Packed Multiply and Add Unsigned Byte to Signed Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPMADDWD | Packed Multiply Words and Add Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPMASKMOVD | Masked Move Packed Doubleword | 3 | | AVX2 | | | |
| VPMASKMOVQ | Masked Move Packed Quadword | 3 | | AVX2 | | | |
| VPMAXSB | Packed Maximum Signed Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPMAXSD | Packed Maximum Signed Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPMAXSW | Packed Maximum Signed Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPMAXUB | Packed Maximum Unsigned Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPMAXUD | Packed Maximum Unsigned Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPMAXUW | Packed Maximum Unsigned Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPMINSB | Packed Minimum Signed Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPMINSD | Packed Minimum Signed Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPMINSW | Packed Minimum Signed Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPMINUB | Packed Minimum Unsigned Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPMINUD | Packed Minimum Unsigned Doublewords | 3 | | AVX, AVX2 (note 4) | | | |





| VPMINUW | Packed Minimum Unsigned Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVMSKB | Packed Move Mask Byte | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVSXBD | Packed Move with Sign-Extension Byte to Doubleword | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVSXBQ | Packed Move with Sign-Extension Byte to Quadword | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVSXBW | Packed Move with Sign-Extension Byte to Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVSXDQ | Packed Move with Sign-Extension Doubleword to Quadword | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVSXWD | Packed Move with Sign-Extension Word to Doubleword | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVSXWQ | Packed Move with Sign-Extension Word to Quadword | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVZXBD | Packed Move with Zero-Extension Byte to Doubleword | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVZXBQ | Packed Move Byte to Quadword with Zero-Extension | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVZXBW | Packed Move Byte to Word with Zero-Extension | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVZXDQ | Packed Move with Zero-Extension Doubleword to Quadword | 3 | | AVX, AVX2 (note 4) | | | |
| VPMOVZXWD | Packed Move Word to Doubleword with Zero-Extension | 3 | | AVX, AVX2 (note 4) | | | |





VPMOVZXWQ 

Packed Move with Zero- 

Extension Word to 
Quadword 

3 


AVX, AVX2 4 




VPMULDQ 

Packed Multiply Signed 
Doubleword to Quadword 

3 


AVX, AVX2 4 




VPMULHRSW 

Packed Multiply High with 
Round and Scale Words 

3 


AVX, AVX2 4 




VPMULHUW 

Packed Multiply High 
Unsigned Word 

3 


AVX, AVX2 4 




VPMULHW 

Packed Multiply High 

Signed Word 

3 


AVX, AVX2 4 




VPMULLD 

Packed Multiply and Store 
Low Signed Doubleword 

3 


AVX, AVX2 4 




VPMULLW 

Packed Multiply Low 

Signed Word 

3 


AVX, AVX2 4 




VPMULUDQ 

Packed Multiply Unsigned 
Doubleword and Store 

Quadword 

3 


AVX, AVX2 4 




VPOR 

Packed Logical Bitwise OR 

3 


AVX, AVX2 4 




VPPERM 

Packed Permute Bytes 

3 


XOP 




VPROTB 

Packed Rotate Bytes 

3 


XOP 




VPROTD 

Packed Rotate 

Doublewords 

3 


XOP 




VPROTQ 

Packed Rotate Quadwords 

3 


XOP 




VPROTW 

Packed Rotate Words 

3 


XOP 




VPSADBW 

Packed Sum of Absolute 
Differences of Bytes into a 
Word 

3 


AVX, AVX2 4 




VPSHAB 

Packed Shift Arithmetic 
Bytes 

3 


XOP 




VPSHAD 

Packed Shift Arithmetic 

Doublewords 

3 


XOP 




VPSHAQ 

Packed Shift Arithmetic 

Quadwords 

3 


XOP 




| Mnemonic | Description | CPL | General-Purpose | SSE | 64-Bit Media | x87 | System |
|---|---|---|---|---|---|---|---|
| VPSHAW | Packed Shift Arithmetic Words | 3 | | XOP | | | |
| VPSHLB | Packed Shift Logical Bytes | 3 | | XOP | | | |
| VPSHLD | Packed Shift Logical Doublewords | 3 | | XOP | | | |
| VPSHLQ | Packed Shift Logical Quadwords | 3 | | XOP | | | |
| VPSHLW | Packed Shift Logical Words | 3 | | XOP | | | |
| VPSHUFB | Packed Shuffle Byte | 3 | | AVX, AVX2 (note 4) | | | |
| VPSHUFD | Packed Shuffle Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPSHUFHW | Packed Shuffle High Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPSHUFLW | Packed Shuffle Low Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPSIGNB | Packed Sign Byte | 3 | | AVX, AVX2 (note 4) | | | |
| VPSIGND | Packed Sign Doubleword | 3 | | AVX, AVX2 (note 4) | | | |
| VPSIGNW | Packed Sign Word | 3 | | AVX, AVX2 (note 4) | | | |
| VPSLLD | Packed Shift Left Logical Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPSLLDQ | Packed Shift Left Logical Double Quadword | 3 | | AVX, AVX2 (note 4) | | | |
| VPSLLQ | Packed Shift Left Logical Quadwords | 3 | | AVX, AVX2 (note 4) | | | |
| VPSLLVD | Variable Shift Left Logical Doublewords | 3 | | AVX2 | | | |
| VPSLLVQ | Variable Shift Left Logical Quadwords | 3 | | AVX2 | | | |
| VPSLLW | Packed Shift Left Logical Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPSRAD | Packed Shift Right Arithmetic Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPSRAVD | Variable Shift Right Arithmetic Doublewords | 3 | | AVX2 | | | |

| Mnemonic | Description | CPL | General-Purpose | SSE | 64-Bit Media | x87 | System |
|---|---|---|---|---|---|---|---|
| VPSRAW | Packed Shift Right Arithmetic Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPSRLD | Packed Shift Right Logical Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPSRLDQ | Packed Shift Right Logical Double Quadword | 3 | | AVX, AVX2 (note 4) | | | |
| VPSRLQ | Packed Shift Right Logical Quadwords | 3 | | AVX, AVX2 (note 4) | | | |
| VPSRLVD | Variable Shift Right Logical Doublewords | 3 | | AVX2 | | | |
| VPSRLVQ | Variable Shift Right Logical Quadwords | 3 | | AVX2 | | | |
| VPSRLW | Packed Shift Right Logical Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBB | Packed Subtract Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBD | Packed Subtract Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBQ | Packed Subtract Quadword | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBSB | Packed Subtract Signed with Saturation Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBSW | Packed Subtract Signed with Saturation Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBUSB | Packed Subtract Unsigned and Saturate Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBUSW | Packed Subtract Unsigned and Saturate Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPSUBW | Packed Subtract Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPTEST | Packed Bit Test | 3 | | AVX | | | |
| VPUNPCKHBW | Unpack and Interleave High Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPUNPCKHDQ | Unpack and Interleave High Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPUNPCKHQDQ | Unpack and Interleave High Quadwords | 3 | | AVX, AVX2 (note 4) | | | |

| Mnemonic | Description | CPL | General-Purpose | SSE | 64-Bit Media | x87 | System |
|---|---|---|---|---|---|---|---|
| VPUNPCKHWD | Unpack and Interleave High Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPUNPCKLBW | Unpack and Interleave Low Bytes | 3 | | AVX, AVX2 (note 4) | | | |
| VPUNPCKLDQ | Unpack and Interleave Low Doublewords | 3 | | AVX, AVX2 (note 4) | | | |
| VPUNPCKLQDQ | Unpack and Interleave Low Quadwords | 3 | | AVX, AVX2 (note 4) | | | |
| VPUNPCKLWD | Unpack and Interleave Low Words | 3 | | AVX, AVX2 (note 4) | | | |
| VPXOR | Packed Logical Bitwise Exclusive OR | 3 | | AVX, AVX2 (note 4) | | | |
| VRCPPS | Reciprocal Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VRCPSS | Reciprocal Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VROUNDPD | Round Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VROUNDPS | Round Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VROUNDSD | Round Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VROUNDSS | Round Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VRSQRTPS | Reciprocal Square Root Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VRSQRTSS | Reciprocal Square Root Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VSHUFPD | Shuffle Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VSHUFPS | Shuffle Packed Single-Precision Floating-Point | 3 | | AVX | | | |

| Mnemonic | Description | CPL | General-Purpose | SSE | 64-Bit Media | x87 | System |
|---|---|---|---|---|---|---|---|
| VSQRTPD | Square Root Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VSQRTPS | Square Root Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VSQRTSD | Square Root Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VSQRTSS | Square Root Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VSTMXCSR | Store MXCSR Control/Status Register | 3 | | AVX | | | |
| VSUBPD | Subtract Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VSUBPS | Subtract Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VSUBSD | Subtract Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VSUBSS | Subtract Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VMLOAD | Load State from VMCB | 0 | | | | | SVM |
| VMMCALL | Call VMM | 0 | | | | | SVM |
| VMRUN | Run Virtual Machine | 0 | | | | | SVM |
| VMSAVE | Save State to VMCB | 0 | | | | | SVM |
| VTESTPD | Packed Bit Test | 3 | | AVX | | | |
| VTESTPS | Packed Bit Test | 3 | | AVX | | | |
| VUCOMISD | Unordered Compare Scalar Double-Precision Floating-Point | 3 | | AVX | | | |
| VUCOMISS | Unordered Compare Scalar Single-Precision Floating-Point | 3 | | AVX | | | |
| VUNPCKHPD | Unpack High Double-Precision Floating-Point | 3 | | AVX | | | |

| Mnemonic | Description | CPL | General-Purpose | SSE | 64-Bit Media | x87 | System |
|---|---|---|---|---|---|---|---|
| VUNPCKHPS | Unpack High Single-Precision Floating-Point | 3 | | AVX | | | |
| VUNPCKLPD | Unpack Low Double-Precision Floating-Point | 3 | | AVX | | | |
| VUNPCKLPS | Unpack Low Single-Precision Floating-Point | 3 | | AVX | | | |
| VXORPD | Logical Bitwise Exclusive OR Packed Double-Precision Floating-Point | 3 | | AVX | | | |
| VXORPS | Logical Bitwise Exclusive OR Packed Single-Precision Floating-Point | 3 | | AVX | | | |
| VZEROALL | Zero All YMM Registers | 3 | | AVX | | | |
| VZEROUPPER | Zero All YMM Registers Upper | 3 | | AVX | | | |
| WAIT | Wait for x87 Floating-Point Exceptions | 3 | | | | X87 | |
| WBINVD | Writeback and Invalidate Caches | 0 | | | | | Base |
| WRFSBASE | Write FS.base | 3 | | | | | FSGSBASE |
| WRGSBASE | Write GS.base | 3 | | | | | FSGSBASE |
| WRMSR | Write to Model-Specific Register | 0 | | | | | MSR |
| XADD | Exchange and Add | 3 | Base | | | | |
| XCHG | Exchange | 3 | Base | | | | |
| XGETBV | Get Extended Control Register Value | 3 | | XSAVE | | | |
| XLAT | Translate Table Index | 3 | Base | | | | |
| XLATB | Translate Table Index (No Operands) | 3 | Base | | | | |
| XOR | Exclusive OR | 3 | Base | | | | |
| XORPD | Logical Bitwise Exclusive OR Packed Double-Precision Floating-Point | 3 | | SSE2 | | | |

| Mnemonic | Description | CPL | General-Purpose | SSE | 64-Bit Media | x87 | System |
|---|---|---|---|---|---|---|---|
| XORPS | Logical Bitwise Exclusive OR Packed Single-Precision Floating-Point | 3 | | SSE | | | |
| XRSTOR | Restore Extended States | 3 | | XSAVE | | | |
| XSAVE | Save Extended States | 3 | | XSAVE | | | |
| XSAVEOPT | Save Extended States Performance Optimized | 3 | | XSAVEOPT | | | |
| XSETBV | Set Extended Control Register Value | 3 | | XSAVE | | | |

Appendix E Obtaining Processor Information Via the CPUID Instruction


This appendix specifies the information that software can obtain about the processor on which it is
running by executing the CPUID instruction. The information in this appendix supersedes the
contents of the CPUID Specification, order #25481, which is now obsolete.

The CPUID instruction is described on page 160. This appendix does not replace the CPUID
instruction reference information presented there.

The CPUID instruction behaves much like a function call. Parameters are passed to the instruction via 
registers and on execution the instruction loads specific registers with return values. These return 
values can be interpreted by software based on the field definitions and their assigned meanings. 

The first input parameter is the function number, which is passed to the instruction via the EAX
register. Some functions also accept a second input parameter passed via the ECX register. Values are
returned via the EAX, EBX, ECX, and EDX registers. Software should not assume that any values
written to these registers prior to the execution of the CPUID instruction will be retained after the
instruction executes (even those that are marked reserved).

The description of each return value breaks the value down into one or more named fields, each of
which represents a bit position or contiguous range of bits. All bit positions that are not defined as
fields are reserved. The value of bits within reserved ranges cannot be relied upon to be zero. Software
must mask off all reserved bits in the return value prior to making any value comparisons of
represented information.
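
The masking requirement can be illustrated with a short sketch. The Python below is illustrative only: the helper name extract_field and the sample register value are assumptions made for this example and are not part of the architecture definition. It isolates a field by its bit range so that reserved bits never take part in a comparison.

```python
def extract_field(value: int, hi: int, lo: int) -> int:
    """Return bits hi:lo (inclusive) of a 32-bit CPUID return value."""
    if not (0 <= lo <= hi <= 31):
        raise ValueError("bit range must satisfy 0 <= lo <= hi <= 31")
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


# Hypothetical EAX value with non-zero junk in reserved bits 31:28 and 15:12.
raw_eax = 0xA01EAF82

# Comparing the raw register would be wrong; compare only the defined fields.
ext_family = extract_field(raw_eax, 27, 20)   # 0x01
base_family = extract_field(raw_eax, 11, 8)   # 0xF
stepping = extract_field(raw_eax, 3, 0)       # 0x2
```

Extracting each field explicitly keeps comparisons valid even if a later processor defines new fields in bits that are reserved today.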

This appendix applies to all AMD processors with a family designation of 0Fh or greater.

E.1 Special Notational Conventions 

The following special notation conventions are used in this appendix: 

• The notation (standard throughout this APM) for representing the function number, optional input
parameter, and the information returned is as follows:

CPUID FnXXXX_XXXX_RRR[FieldName]_xYYY

Where:

- XXXX_XXXX is the function number represented in hexadecimal (passed to the instruction in
EAX).

- RRR is one of {EDX, ECX, EBX, EAX} and represents a register holding a return value.

- YYY represents the optional input parameter passed in the ECX register expressed as a
hexadecimal number. If this parameter is not used, the characters represented by _xYYY are
omitted from the notation.

- FieldName identifies a specific named element of processor information represented by a
specific bit range (1 or more bits wide) within the RRR register.

• The notation CPUID FnXXXX_XXXX_RRR is used when referring to one of the registers that holds
information returned by the instruction.

• The notation CPUID FnXXXX_XXXX or FnXXXX_XXXX is used to refer to a specific function
number.

• Most one-bit fields indicate support or non-support of a specific processor feature. By convention, 
(unless otherwise noted) a value of 1 means that the feature is supported by the processor and a 
value of 0 means that the feature is not supported by the processor. 
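
As an informal aid to reading the notation, the following Python sketch splits a reference into its parts. The function name parse_notation and the returned dictionary layout are assumptions made for this example; only the notation itself comes from this appendix.

```python
import re

# Pattern for the notation CPUID FnXXXX_XXXX_RRR[FieldName]_xYYY.
# The register, field name, and _xYYY parts are all optional.
_NOTATION = re.compile(
    r"^CPUID Fn(?P<fn>[0-9A-Fa-f]{4}_[0-9A-Fa-f]{4})"  # function number (EAX input)
    r"(?:_(?P<reg>E[ABCD]X))?"                          # register holding the result
    r"(?:\[(?P<field>[^\]]+)\])?"                       # named field in that register
    r"(?:_x(?P<sub>[0-9A-Fa-f]+))?$"                    # optional ECX input value
)


def parse_notation(text: str) -> dict:
    """Split a CPUID notation string into function, register, field, subfunction."""
    m = _NOTATION.match(text)
    if m is None:
        raise ValueError(f"not a CPUID notation string: {text!r}")
    return {
        "function": int(m["fn"].replace("_", ""), 16),
        "register": m["reg"],
        "field": m["field"],
        "subfunction": None if m["sub"] is None else int(m["sub"], 16),
    }
```

For example, parse_notation("CPUID Fn0000_000D_EAX_x0") yields function 0Dh, register EAX, no field name, and an ECX input of 0.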

E.2 Standard and Extended Function Numbers 

The CPUID instruction supports two sets or ranges of function numbers: standard and extended. 

• The smallest function number of the standard function range is Fn0000_0000. The largest function
number of the standard function range, for a particular implementation, is returned in CPUID
Fn0000_0000_EAX.

• The smallest function number of the extended function range is Fn8000_0000. The largest 
function number of the extended function range, for a particular implementation, is returned in 
CPUID Fn8000_0000_EAX. 
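
The two ranges suggest a simple bounds check before any function is queried. The Python sketch below is an illustration under stated assumptions: the name is_function_supported is hypothetical, and the two maxima are taken to be values previously read from CPUID Fn0000_0000_EAX and CPUID Fn8000_0000_EAX.

```python
STANDARD_BASE = 0x0000_0000  # smallest standard function number
EXTENDED_BASE = 0x8000_0000  # smallest extended function number


def is_function_supported(fn: int, max_standard: int, max_extended: int) -> bool:
    """Check a function number against the largest reported function of its range.

    max_standard: value assumed to come from CPUID Fn0000_0000_EAX.
    max_extended: value assumed to come from CPUID Fn8000_0000_EAX.
    """
    if fn >= EXTENDED_BASE:
        return fn <= max_extended
    return STANDARD_BASE <= fn <= max_standard
```

A caller would typically run this check first and treat the registers as meaningless when it fails.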

E.3 Standard Feature Function Numbers 

This section describes each of the defined CPUID functions in the standard range. 

E.3.1 Function 0h—Maximum Standard Function Number and Vendor String

This function number provides information about the maximum standard function number supported
on this processor and a string that identifies the vendor of the product.

CPUID Fn0000_0000_EAX Largest Standard Function Number 

The value returned in EAX provides the largest standard function number supported by this processor. 


| Bits | Field Name | Description |
|---|---|---|
| 31:0 | LFuncStd | Largest standard function. The largest CPUID standard function input value supported by the processor implementation. |


CPUID Fn0000_0000_E[D,C,B]X Processor Vendor 

The values returned in EBX, EDX, and ECX together provide a 12-character string identifying the
vendor of this processor. Each register supplies 4 characters. The leftmost character of each substring
is stored in the least significant byte of the register. The string is the concatenation of the
contents of EBX, EDX, and ECX in left to right order. No null terminator is included in the string.

CPUID Fn8000_0000_E[D,C,B]X returns the same values as this function.


| Bits | Field Name | Description |
|---|---|---|
| 31:0 | Vendor | Four characters of the 12-byte character string (encoded in ASCII) “AuthenticAMD”. See Table E-1 below. |


Table E-1. CPUID Fn0000_0000_E[D,C,B]X values 


| Register | Value | Description |
|---|---|---|
| CPUID Fn0000_0000_EBX | 6874_7541h | The ASCII characters “h t u A”. |
| CPUID Fn0000_0000_ECX | 444D_4163h | The ASCII characters “D M A c”. |
| CPUID Fn0000_0000_EDX | 6974_6E65h | The ASCII characters “i t n e”. |
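
The byte ordering described above can be checked with a brief sketch. This Python example is illustrative only; the function name vendor_string is an assumption. It packs the three register values in EBX, EDX, ECX order as little-endian 32-bit words, which places the least significant byte of each register first.

```python
import struct


def vendor_string(ebx: int, edx: int, ecx: int) -> str:
    """Concatenate EBX, EDX, ECX (in that order) into the 12-character vendor string."""
    # "<III" = three little-endian unsigned 32-bit integers, so the least
    # significant byte of each register becomes the leftmost character.
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


# Register values from Table E-1.
vendor = vendor_string(0x68747541, 0x69746E65, 0x444D4163)
```

With the Table E-1 values the result is the string "AuthenticAMD", with no terminating null.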


E.3.2 Function 1h—Processor and Processor Feature Identifiers 

This function number identifies the processor family, model, and stepping and provides feature 
support information. 

CPUID Fn0000_0001_EAX Family, Model, Stepping Identifiers

The value returned in EAX provides the family, model, and stepping identifiers. Three values are used 
by software to identify a processor: Family, Model, and Stepping. 


| Bits | Field Name | Description |
|---|---|---|
| 31:28 | — | Reserved. |
| 27:20 | ExtFamily | Processor extended family. See the definition of Family[7:0] below. |
| 19:16 | ExtModel | Processor extended model. See the definition of Model[7:0] below. |
| 15:12 | — | Reserved. |
| 11:8 | BaseFamily | Base processor family. See the definition of Family[7:0] below. |
| 7:4 | BaseModel | Base processor model. See the definition of Model[7:0] below. |
| 3:0 | Stepping | Processor stepping. Processor stepping (revision) for a specific model. |


The processor Family identifies one or more processors as belonging to a group that possesses some 
common definition for software or hardware purposes. The Model specifies one instance of a 
processor family. The Stepping identifies a particular version of a specific model. Therefore, Family, 
Model and Stepping, when taken together, form a unique identification or signature for a processor. 

The Family is an 8-bit value and is defined as: Family[7:0] = ({0000b,BaseFamily[3:0]} +
ExtFamily[7:0]). For example, if BaseFamily[3:0] = Fh and ExtFamily[7:0] = 01h, then Family[7:0] =
10h. If BaseFamily[3:0] is less than Fh, then ExtFamily is reserved and Family is equal to
BaseFamily[3:0].

Model is an 8-bit value and is defined as: Model[7:0] = {ExtModel[3:0],BaseModel[3:0]}. For
example, if ExtModel[3:0] = Eh and BaseModel[3:0] = 8h, then Model[7:0] = E8h. If BaseFamily[3:0]
is less than 0Fh, then ExtModel is reserved and Model is equal to BaseModel[3:0].
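
The Family and Model definitions above can be expressed as a short computation. The Python sketch below is an illustration, not a definitive implementation; the function name decode_signature and the sample EAX value 001E_0F82h are assumptions chosen to reproduce the two worked examples in the text.

```python
def decode_signature(eax: int) -> tuple:
    """Return (family, model, stepping) from a CPUID Fn0000_0001_EAX value."""
    stepping = eax & 0xF              # bits 3:0
    base_model = (eax >> 4) & 0xF     # bits 7:4
    base_family = (eax >> 8) & 0xF    # bits 11:8
    ext_model = (eax >> 16) & 0xF     # bits 19:16
    ext_family = (eax >> 20) & 0xFF   # bits 27:20

    if base_family == 0xF:
        # Extended fields are only meaningful when BaseFamily is Fh.
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        # ExtFamily and ExtModel are reserved; ignore whatever they contain.
        family = base_family
        model = base_model
    return family, model, stepping


# BaseFamily = Fh, ExtFamily = 01h, ExtModel = Eh, BaseModel = 8h, Stepping = 2h.
family, model, stepping = decode_signature(0x001E0F82)
```

For the sample value this gives Family 10h and Model E8h, matching the examples above.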

The value returned by CPUID Fn8000_0001_EAX is equivalent to CPUID Fn0000_0001_EAX.

CPUID Fn0000_0001_EBX LocalApicId, LogicalProcessorCount, CLFlush

The value returned in EBX provides miscellaneous information regarding the processor brand, the 
number of logical threads per processor socket, the CLFLUSH instruction, and APIC. 


| Bits | Field Name | Description |
|---|---|---|
| 31:24 | LocalApicId | Initial local APIC physical ID. The 8-bit value assigned to the local APIC physical ID register at power-up. Some of the bits of LocalApicId represent the core within a processor and other bits represent the processor ID. See the APIC20 “APIC ID” register in the processor BKDG for details. |
| 23:16 | LogicalProcessorCount | Logical processor count. If CPUID Fn0000_0001_EDX[HTT] = 1 then LogicalProcessorCount is the number of cores per processor. If CPUID Fn0000_0001_EDX[HTT] = 0 then LogicalProcessorCount is reserved. See E.5.1 [Legacy Method]. |
| 15:8 | CLFlush | CLFLUSH size. Specifies the size of a cache line in quadwords flushed by the CLFLUSH instruction. See “CLFLUSH” in APM3. |
| 7:0 | 8BitBrandId | 8-bit brand ID. This field, in conjunction with CPUID Fn8000_0001_EBX[BrandId], is used by the system firmware to generate the processor name string. See the appropriate processor revision guide for how to program the processor name string. |
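
A brief sketch may help with the two less obvious points in this register: the CLFlush field counts quadwords rather than bytes, and LogicalProcessorCount is only meaningful when HTT is 1. The Python below is illustrative; the function name decode_fn1_ebx, the dictionary keys, and the sample value are assumptions.

```python
def decode_fn1_ebx(ebx: int, htt: bool) -> dict:
    """Decode CPUID Fn0000_0001_EBX; htt is CPUID Fn0000_0001_EDX[HTT]."""
    count = (ebx >> 16) & 0xFF
    clflush_quadwords = (ebx >> 8) & 0xFF
    return {
        "local_apic_id": (ebx >> 24) & 0xFF,
        # The count field is reserved when HTT = 0, so report no value.
        "logical_processor_count": count if htt else None,
        # One quadword is 8 bytes.
        "clflush_bytes": clflush_quadwords * 8,
        "brand_id": ebx & 0xFF,
    }


# Hypothetical value: APIC ID 2, count 8, CLFlush = 8 quadwords, brand ID 0.
info = decode_fn1_ebx(0x02080800, htt=True)
```

Here a CLFlush value of 8 corresponds to a 64-byte cache line.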


CPUID Fn0000_0001_ECX Feature Identifiers

The value returned in ECX contains the following miscellaneous feature identifiers: 


| Bits | Field Name | Description |
|---|---|---|
| 31 | — | RAZ. Reserved for use by hypervisor to indicate guest status. |
| 30 | RDRAND | RDRAND instruction support. |
| 29 | F16C | Half-precision convert instruction support. See "Half-Precision Floating-Point Conversion" in APM1 and listings for individual F16C instructions in APM5. |
| 28 | AVX | AVX instruction support. See APM4. |
| 27 | OSXSAVE | XSAVE (and related) instructions are enabled. See “OSXSAVE” in APM2. |
| 26 | XSAVE | XSAVE (and related) instructions are supported by hardware. See “XSAVE/XRSTOR Instructions” in APM2. |
| 25 | AES | AES instruction support. See “AES Instructions” in APM4. |
| 24 | — | Reserved. |
| 23 | POPCNT | POPCNT instruction. See “POPCNT” in APM3. |
| 22 | MOVBE | MOVBE instruction support. |
| 21 | — | Reserved. |
| 20 | SSE42 | SSE4.2 instruction support. See "Determining Media and x87 Feature Support" in APM2 and individual SSE4.2 instruction listings in APM4. |
| 19 | SSE41 | SSE4.1 instruction support. See individual instruction listings in APM4. |
| 18:14 | — | Reserved. |
| 13 | CMPXCHG16B | CMPXCHG16B instruction support. See “CMPXCHG16B” in APM3. |
| 12 | FMA | FMA instruction support. |
| 11:10 | — | Reserved. |
| 9 | SSSE3 | Supplemental SSE3 instruction support. |
| 8:4 | — | Reserved. |
| 3 | MONITOR | MONITOR/MWAIT instructions. See “MONITOR” and “MWAIT” in APM3. |
| 2 | — | Reserved. |
| 1 | PCLMULQDQ | PCLMULQDQ instruction support. See instruction reference page for the PCLMULQDQ / VPCLMULQDQ instruction in APM4. |
| 0 | SSE3 | SSE3 instruction support. See Appendix D “Instruction Subsets and CPUID Feature Sets” in APM3 for the list of instructions covered by the SSE3 feature bit. See APM4 for the definition of the SSE3 instructions. |
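
One way to work with these one-bit fields is a table-driven decoder. The Python sketch below is illustrative only; the names FN1_ECX_FEATURES and decode_fn1_ecx are assumptions, while the bit positions are those listed above. Reserved bits are simply absent from the table, so they are ignored.

```python
# Bit position -> field name for CPUID Fn0000_0001_ECX (reserved bits omitted).
FN1_ECX_FEATURES = {
    30: "RDRAND", 29: "F16C", 28: "AVX", 27: "OSXSAVE", 26: "XSAVE",
    25: "AES", 23: "POPCNT", 22: "MOVBE", 20: "SSE42", 19: "SSE41",
    13: "CMPXCHG16B", 12: "FMA", 9: "SSSE3", 3: "MONITOR",
    1: "PCLMULQDQ", 0: "SSE3",
}


def decode_fn1_ecx(ecx: int) -> set:
    """Return the names of all feature fields whose bit is 1."""
    return {name for bit, name in FN1_ECX_FEATURES.items() if (ecx >> bit) & 1}


# Hypothetical value: AVX, XSAVE, SSE3 set, plus reserved bit 24 set.
features = decode_fn1_ecx((1 << 28) | (1 << 26) | (1 << 24) | (1 << 0))
```

The reserved bit in the sample value does not appear in the result, which is the behavior the reserved-bit rule calls for.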


CPUID Fn0000_0001_EDX Feature Identifiers


The value returned in EDX contains the following miscellaneous feature identifiers: 


| Bits | Field Name | Description |
|---|---|---|
| 31:29 | — | Reserved. |
| 28 | HTT | Hyper-threading technology. Indicates either that there is more than one thread per core or more than one core per processor. See “Legacy Method” on page 639. |
| 27 | — | Reserved. |
| 26 | SSE2 | SSE2 instruction support. See Appendix D “CPUID Feature Sets” in APM3. |
| 25 | SSE | SSE instruction support. See Appendix D “CPUID Feature Sets” in APM3 and “64-Bit Media Programming” in APM1. |
| 24 | FXSR | FXSAVE and FXRSTOR instructions. See “FXSAVE” and “FXRSTOR” in APM5. |
| 23 | MMX | MMX™ instructions. See Appendix D “CPUID Feature Sets” in APM3 and “128-Bit Media and Scientific Programming” in APM1. |
| 22:20 | — | Reserved. |
| 19 | CLFSH | CLFLUSH instruction support. See “CLFLUSH” in APM3. |
| 18 | — | Reserved. |
| 17 | PSE36 | Page-size extensions. The PDE[20:13] supplies physical address [39:32]. See “Page Translation and Protection” in APM2. |
| 16 | PAT | Page attribute table. See “Page-Attribute Table Mechanism” in APM2. |
| 15 | CMOV | Conditional move instructions. See “CMOV”, “FCMOV” in APM3. |
| 14 | MCA | Machine check architecture. See “Machine Check Mechanism” in APM2. |
| 13 | PGE | Page global extension. See “Page Translation and Protection” in APM2. |
| 12 | MTRR | Memory-type range registers. See “Page Translation and Protection” in APM2. |
| 11 | SysEnterSysExit | SYSENTER and SYSEXIT instructions. See “SYSENTER”, “SYSEXIT” in APM3. |
| 10 | — | Reserved. |
| 9 | APIC | Advanced programmable interrupt controller. Indicates APIC exists and is enabled. See “Exceptions and Interrupts” in APM2. |
| 8 | CMPXCHG8B | CMPXCHG8B instruction. See “CMPXCHG8B” in APM3. |
| 7 | MCE | Machine check exception. See “Machine Check Mechanism” in APM2. |
| 6 | PAE | Physical-address extensions. Indicates support for physical addresses ≥ 32b. Number of physical address bits above 32b is implementation specific. See “Page Translation and Protection” in APM2. |
| 5 | MSR | AMD model-specific registers. Indicates support for AMD model-specific registers (MSRs), with RDMSR and WRMSR instructions. See “Model Specific Registers” in APM2. |
| 4 | TSC | Time stamp counter. RDTSC and RDTSCP instruction support. See “Debug and Performance Resources” in APM2. |
| 3 | PSE | Page-size extensions. See “Page Translation and Protection” in APM2. |
| 2 | DE | Debugging extensions. See “Debug and Performance Resources” in APM2. |
| 1 | VME | Virtual-mode enhancements. CR4.VME, CR4.PVI, software interrupt indirection, expansion of the TSS with the software indirection bitmap, EFLAGS.VIF, EFLAGS.VIP. See “System Resources” in APM2. |
| 0 | FPU | x87 floating point unit on-chip. See “x87 Floating Point Programming” in APM1. |
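
Software often needs to confirm that a set of required features is present before continuing. The Python sketch below shows one way to do that against this register; the names FN1_EDX_BITS and missing_features, and the sample requirement list, are assumptions made for illustration.

```python
# Field name -> bit position for CPUID Fn0000_0001_EDX (reserved bits omitted).
FN1_EDX_BITS = {
    "FPU": 0, "VME": 1, "DE": 2, "PSE": 3, "TSC": 4, "MSR": 5, "PAE": 6,
    "MCE": 7, "CMPXCHG8B": 8, "APIC": 9, "SysEnterSysExit": 11, "MTRR": 12,
    "PGE": 13, "MCA": 14, "CMOV": 15, "PAT": 16, "PSE36": 17, "CLFSH": 19,
    "MMX": 23, "FXSR": 24, "SSE": 25, "SSE2": 26, "HTT": 28,
}


def missing_features(edx: int, required: list) -> list:
    """Return the required field names whose bit is 0, in the order given."""
    return [name for name in required
            if not ((edx >> FN1_EDX_BITS[name]) & 1)]


# Hypothetical EDX value with only FPU, SSE, and SSE2 set.
sample_edx = (1 << 0) | (1 << 25) | (1 << 26)
absent = missing_features(sample_edx, ["FPU", "SSE", "SSE2", "MMX"])
```

Returning the missing names, rather than a single boolean, lets the caller report exactly which requirement failed.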


E.3.3 Functions 2h-4h—Reserved 
CPUID Fn0000_000[4:2] Reserved 

These function numbers are reserved. 


E.3.4 Function 5h—Monitor and MWait Features 

This function provides feature identifiers for the MONITOR and MWAIT instructions. For more
information see the description of the MONITOR instruction on page 392 and the MWAIT instruction
on page 398.

CPUID Fn0000_0005_EAX Monitor/MWait

The value returned in EAX provides the following information:

| Bits | Field Name | Description |
|---|---|---|
| 31:16 | — | Reserved. |
| 15:0 | MonLineSizeMin | Smallest monitor-line size in bytes. |

CPUID Fn0000_0005_EBX Monitor/MWait

The value returned in EBX provides the following information:

| Bits | Field Name | Description |
|---|---|---|
| 31:16 | — | Reserved. |
| 15:0 | MonLineSizeMax | Largest monitor-line size in bytes. |

CPUID Fn0000_0005_ECX Monitor/MWait

The value returned in ECX provides the following information:

| Bits | Field Name | Description |
|---|---|---|
| 31:2 | — | Reserved. |
| 1 | IBE | Interrupt break-event. Indicates MWAIT can use ECX bit 0 to allow interrupts to cause an exit from the monitor event pending state, even if EFLAGS.IF=0. |
| 0 | EMX | Enumerate MONITOR/MWAIT extensions. Indicates that enumeration of MONITOR/MWAIT extensions is supported. |


CPUID Fn0000_0005_EDX Monitor/MWait 

The value returned in EDX is undefined and is reserved. 
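
The three defined registers of this function can be decoded together. The Python sketch below is illustrative; the function name decode_fn5, the dictionary keys, and the sample values are assumptions. EDX is not read because it is reserved.

```python
def decode_fn5(eax: int, ebx: int, ecx: int) -> dict:
    """Decode the MONITOR/MWAIT fields of CPUID Fn0000_0005."""
    return {
        "mon_line_size_min": eax & 0xFFFF,  # EAX bits 15:0, in bytes
        "mon_line_size_max": ebx & 0xFFFF,  # EBX bits 15:0, in bytes
        "emx": bool(ecx & 1),               # ECX bit 0
        "ibe": bool((ecx >> 1) & 1),        # ECX bit 1
    }


# Hypothetical values; the upper half of EBX holds junk in reserved bits.
monitor = decode_fn5(0x00000040, 0xABCD0040, 0x00000003)
```

In this sample the smallest and largest monitor-line sizes are both 64 bytes, and both EMX and IBE are reported.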

E.3.5 Function 6h—Power Management Related Features 

This function provides information about the local APIC timer timebase and the effective frequency 
interface for the processor. 

CPUID Fn0000_0006_EAX Local APIC Timer Invariance 

The value returned in EAX provides the following information:

| Bits | Field Name | Description |
|---|---|---|
| 31:3 | — | Reserved. |
| 2 | ARAT | If set, indicates that the timebase for the local APIC timer is not affected by processor p-state. |
| 1:0 | — | Reserved. |

CPUID Fn0000_0006_EBX Reserved 

The value returned in EBX is undefined and is reserved. 

CPUID Fn0000_0006_ECX Effective Processor Frequency Interface

The value returned in ECX indicates support of the processor effective frequency interface. For more
information on this feature, see "Determining Processor Effective Frequency" in APM2.

| Bits | Field Name | Description |
|---|---|---|
| 31:1 | — | Reserved. |
| 0 | EffFreq | Effective frequency interface support. If set, indicates presence of MSR0000_00E7 (MPERF) and MSR0000_00E8 (APERF). |


CPUID Fn0000_0006_EDX Reserved


The value returned in EDX is undefined and is reserved. 

E.3.6 Function 7h—Structured Extended Feature Identifiers 


CPUID Fn0000_0007_EAX_x0 Structured Extended Feature Identifiers (ECX=0)

| Bits | Field Name | Description |
|---|---|---|
| 31:0 | MaxSubFn | Returns the number of subfunctions supported. |

CPUID Fn0000_0007_EBX_x0 Structured Extended Feature Identifiers (ECX=0)

| Bits | Field Name | Description |
|---|---|---|
| 31:9 | — | Reserved. |
| 8 | BMI2 | Bit manipulation group 2 instruction support. |
| 7 | SMEP | Supervisor mode execution protection. |
| 6 | — | Reserved. |
| 5 | AVX2 | AVX2 instruction subset support. |
| 4 | — | Reserved. |
| 3 | BMI1 | Bit manipulation group 1 instruction support. |
| 2:1 | — | Reserved. |
| 0 | FSGSBASE | FS and GS base read write instruction support. |

CPUID Fn0000_0007_ECX_x0 Structured Extended Feature Identifiers (ECX=0)

| Bits | Field Name | Description |
|---|---|---|
| 31:0 | — | Reserved. |

CPUID Fn0000_0007_EDX_x0 Structured Extended Feature Identifiers (ECX=0)

| Bits | Field Name | Description |
|---|---|---|
| 31:0 | — | Reserved. |
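
The AVX2 bit reported here combines with the AVX bit of CPUID Fn0000_0001_ECX for instructions that Table D-2 lists as "AVX, AVX2" with note 4. The Python sketch below is an illustration under stated assumptions: the function name vector_operand_width is hypothetical, and the 128/256-bit split follows the wording of note 4, which says 256-bit operand support is usually (not always) the AVX2-dependent feature. Callers are assumed to have confirmed that function 7 is within the standard range before reading EBX.

```python
AVX_BIT = 28   # CPUID Fn0000_0001_ECX[AVX]
AVX2_BIT = 5   # CPUID Fn0000_0007_EBX_x0[AVX2]


def vector_operand_width(fn1_ecx: int, fn7_ebx: int) -> int:
    """Widest operand size in bits assumed usable for an 'AVX, AVX2 (note 4)' instruction.

    Returns 0 when AVX is not reported, 128 with AVX only, 256 with AVX and AVX2.
    """
    has_avx = (fn1_ecx >> AVX_BIT) & 1
    has_avx2 = (fn7_ebx >> AVX2_BIT) & 1
    if not has_avx:
        return 0
    return 256 if has_avx2 else 128
```

The individual instruction descriptions remain the authority on which forms depend on AVX2; this helper only captures the common case.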


E.3.7 Functions 8h-Ch—Reserved 
CPUID Fn0000_000[C:8] Reserved 


These function numbers are reserved. 

E.3.8 Function Dh—Processor Extended State Enumeration 

The XSAVE / XRSTOR instructions are used to save and restore x87/MMX FPU and SSE processor 
state. These instructions allow processor state associated with specific architected features to be 
selectively saved and restored. This function provides information about extended state support and 
save area size requirements. 

The function has a number of subfunctions specified by the input value passed to the CPUID 
instruction in the ECX register. 

Subfunction 0 of Fn0000_000D

Subfunction 0 provides information about features within the extended processor state management
architecture that are supported by the processor.

CPUID Fn0000_000D_EAX_x0 Processor Extended State Enumeration (ECX=0) 

The value returned in EAX provides a bit mask specifying which of the features defined by the 
extended processor state architecture are supported by the processor. 


Bits 

Field Name 

Description 

31:0 

XFeatureSupportedMask[31:0] 

Reports the valid bit positions for the lower 32 bits of the 
XFeatureEnabledMask register. If a bit is set, the corresponding 
feature is supported. See “XSAVE/XRSTOR Instructions” in APM2. 

CPUID Fn0000_000D_EBX_x0 Processor Extended State Enumeration (ECX=0)


The value returned in EBX gives the save area size requirement in bytes based on the features
currently enabled in the XFEATURE_ENABLED_MASK (XCR0).


Bits 

Field Name 

Description 

31:0 

XFeatureEnabledSizeMax 

Size in bytes of XSAVE/XRSTOR area for the currently enabled features in
XCR0.


CPUID Fn0000_000D_ECX_x0 Processor Extended State Enumeration (ECX=0) 

The value returned in ECX gives the save area size requirement in bytes for all extended state 
management features supported by the processor (whether enabled or not). 


Bits 

Field Name 

Description 

31:0 

XFeatureSupportedSizeMax 

Size in bytes of XSAVE/XRSTOR area for all features that the core 
supports. See XFeatureEnabledSizeMax. 


CPUID Fn0000_000D_EDX_x0 Processor Extended State Enumeration (ECX=0) 

The value returned in EDX provides a bit mask specifying which of the features defined by the 
extended processor state architecture are supported by the processor. 


Bits 

Field Name 

Description 

31:0 

XFeatureSupportedMask[63:32] 

Reports the valid bit positions for the upper 32 bits of the 
XFeatureEnabledMask register. If a bit is set, the corresponding 
feature is supported. 


See “XSAVE/XRSTOR Instructions” in APM2 and reference pages for the individual instructions in 
APM4. 
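The four registers returned by subfunction 0 can be gathered into one structure. The C sketch below is a minimal example under stated assumptions: the structure and function names are hypothetical, and the caller is assumed to supply the EAX, EBX, ECX, and EDX values returned by CPUID Fn0000_000D with ECX=0.

```c
#include <stdint.h>

/* Hypothetical container for the output of CPUID Fn0000_000D, ECX=0. */
struct xstate_enum {
    uint64_t supported_mask;     /* EDX:EAX, XFeatureSupportedMask[63:0] */
    uint32_t enabled_size_max;   /* EBX, bytes for features enabled in XCR0 */
    uint32_t supported_size_max; /* ECX, bytes for all supported features */
};

static struct xstate_enum xstate_decode(uint32_t eax, uint32_t ebx,
                                        uint32_t ecx, uint32_t edx)
{
    struct xstate_enum e;
    e.supported_mask = ((uint64_t)edx << 32) | eax;
    e.enabled_size_max = ebx;
    e.supported_size_max = ecx;
    return e;
}

/* Nonzero when feature bit `n` (0..63) is reported as supported. */
static int xstate_feature_supported(const struct xstate_enum *e, unsigned n)
{
    return n < 64 && ((e->supported_mask >> n) & 1u);
}
```

Keeping the two size values separate reflects the distinction drawn above: one depends on what is currently enabled in XCR0, the other only on what the processor supports.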

Subfunction 1 of Fn0000_000D

Subfunction 1 provides additional information about features within the extended processor state 
management architecture that are supported by the processor. 


CPUID Fn0000_000D_EAX_x1 Processor Extended State Enumeration (ECX=1) 


Bits 

Field Name 

Description 

31:1

—

Reserved.

0 

XSAVEOPT 

XSAVEOPT is available. 


CPUID Fn0000_000D_E[D,C,B]X_x1 Processor Extended State Enumeration (ECX=1) 

The values returned in EBX, ECX, and EDX for subfunction 1 are undefined and are reserved.


Subfunction 2 of Fn0000_000D

Subfunction 2 provides information about the size and offset of the 256-bit SSE vector floating point
processor unit state save area. 

CPUID Fn0000_000D_EAX_x2 Processor Extended State Enumeration (ECX=2) 

The value returned in EAX provides information about the size of the 256-bit SSE vector floating 
point processor unit state save area. 


Bits 

Field Name 

Description 

31:0 

YmmSaveStateSize 

YMM state save size. The state save area size in bytes for the YMM registers.


CPUID Fn0000_000D_EBX_x2 Processor Extended State Enumeration (ECX=2) 

The value returned in EBX provides information about the offset of the 256-bit SSE vector floating 
point processor unit state save area from the base of the extended state (XSAVE/XRSTOR) save area. 


Bits 

Field Name 

Description 

31:0 

YmmSaveStateOffset 

YMM state save offset. The offset in bytes from the base of the extended state 
save area of the YMM register state save area. 


CPUID Fn0000_000D_E[D,C]X_x2 Processor Extended State Enumeration (ECX=2) 

The values returned in ECX and EDX for subfunction 2 are undefined and are reserved. 

If CPUID Fn0000_000D is executed with a subfunction (passed in ECX) greater than 2 but less than
3Eh, the instruction returns all zeros in the EAX, EBX, ECX, and EDX registers. 
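A size and offset pair such as the one returned by subfunction 2 can be checked against the total save area size. The following C sketch is one way to do that. The type and function names are hypothetical, and treating a zero size as "no component described" is an assumption based on the all-zero return for unimplemented subfunctions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one extended-state component. */
struct xstate_component {
    uint32_t size;   /* EAX of the subfunction, in bytes */
    uint32_t offset; /* EBX of the subfunction, bytes from save-area base */
};

/* Assumption: an all-zero result means no component is described. */
static bool xstate_component_present(struct xstate_component c)
{
    return c.size != 0;
}

/* True when the component lies entirely inside a save area of
 * `area_size` bytes. The sum is done in 64 bits to avoid wraparound. */
static bool xstate_component_fits(struct xstate_component c, uint32_t area_size)
{
    uint64_t end = (uint64_t)c.offset + c.size;
    return xstate_component_present(c) && end <= area_size;
}
```

The input values used to exercise such a helper would come from CPUID on the running processor; any constants in a test are illustrative only.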

Subfunction 3Eh of Fn0000_000D 

Subfunction 3Eh provides information about the size and offset of the Lightweight Profiling (LWP) 
unit state save area. 

CPUID Fn0000_000D_EAX_x3E Processor Extended State Enumeration (ECX=62) 

The value returned in EAX provides the size of the Lightweight Profiling (LWP) unit state save area. 


Bits 

Field Name 

Description 

31:0 

LwpSaveStateSize 

LWP state save area size. The size of the save area for LWP state in bytes. See 
“Lightweight Profiling” in APM2. 

CPUID Fn0000_000D_EBX_x3E Processor Extended State Enumeration (ECX=62)

The value returned in EBX provides the offset of the Lightweight Profiling (LWP) unit state save area 
from the base of the extended state (XSAVE/XRSTOR) save area. 


Bits 

Field Name 

Description 

31:0 

LwpSaveStateOffset 

LWP state save byte offset. The offset in bytes from the base of the extended 
state save area of the state save area for LWP. See “Lightweight Profiling” in 
APM2. 


CPUID Fn0000_000D_E[D,C]X_x3E Processor Extended State Enumeration (ECX=62) 

The values returned in ECX and EDX for subfunction 3Eh are undefined and are reserved. 

Subfunctions of Fn0000_000D greater than 3Eh

For CPUID Fn0000_000D, if the subfunction (specified by the contents of ECX) passed as input to the
instruction is greater than 3Eh, the instruction returns zero in the EAX, EBX, ECX, and EDX registers. 

E.3.9 Functions 4000_0000h-4000_00FFh—Reserved for Hypervisor Use
CPUID Fn4000_00[FF:00] Reserved 

These function numbers are reserved for use by the virtual machine monitor. 

E.4 Extended Feature Function Numbers 

This section describes each of the defined CPUID functions in the extended range. 

E.4.1 Function 8000_0000h—Maximum Extended Function Number and Vendor 
String 

This function provides information about the maximum extended function number supported on this
processor and a string that identifies the vendor of the product. 


CPUID Fn8000_0000_EAX Largest Extended Function Number 

The value returned in EAX provides the largest extended function number supported by the processor. 


Bits 

Field Name 

Description 

31:0 

LFuncExt 

Largest extended function. The largest CPUID extended function input value 
supported by the processor implementation. 

CPUID Fn8000_0000_E[D,C,B]X Processor Vendor

The values returned in EBX, ECX, and EDX together provide a 12-character string identifying the
vendor of this processor. The output string is the same as the one returned by Fn0000_0000. See
CPUID Fn0000_0000_E[D,C,B]X on page 608 for more details.


Bits 

Field Name 

Description 

31:0 

Vendor 

Four characters of the 12-byte character string (encoded in ASCII)
“AuthenticAMD”. See Table E-2 below.


Table E-2. CPUID Fn8000_0000_E[D,C,B]X values 


Register 

Value 

Description 

CPUID Fn8000_0000_EBX

6874_7541h

The ASCII characters “h t u A”.

CPUID Fn8000_0000_ECX

444D_4163h

The ASCII characters “D M A c”.

CPUID Fn8000_0000_EDX 

6974_6E65h 

The ASCII characters “i t n e”. 
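Each register in Table E-2 holds four characters with the first character in the least significant byte, and the string reads in the order EBX, EDX, ECX. The C sketch below assembles the string from the three register values. The function names are hypothetical, and the caller is assumed to have obtained the values from CPUID Fn8000_0000 (or Fn0000_0000).

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the four bytes of `reg` to `dst`, least significant byte first. */
static void put_le32(char *dst, uint32_t reg)
{
    for (size_t i = 0; i < 4; i++)
        dst[i] = (char)((reg >> (8 * i)) & 0xFFu);
}

/* Build the 12-character vendor string; `out` must hold at least 13 bytes. */
static void cpuid_vendor_string(uint32_t ebx, uint32_t ecx, uint32_t edx,
                                char out[13])
{
    put_le32(out + 0, ebx); /* first four characters  */
    put_le32(out + 4, edx); /* middle four characters */
    put_le32(out + 8, ecx); /* last four characters   */
    out[12] = '\0';
}
```

With the Table E-2 values as input, the resulting buffer holds the vendor string given in the table description.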


E.4.2 Function 8000_0001h—Extended Processor and Processor Feature Identifiers 
CPUID Fn8000_0001_EAX AMD Family, Model, Stepping

The value returned in EAX provides the family, model, and stepping identifiers. Three values are used
by software to identify a processor: Family, Model, and Stepping. The value returned in EAX is the
same as the value returned in EAX for Fn0000_0001. See CPUID Fn0000_0001_EAX on page 609
for more details on the field definitions.


Bits 

Field Names 

Description 

31:0 

Family, Model, Stepping 

See: CPUID Fn0000_0001_EAX. 


CPUID Fn8000_0001_EBX BrandId Identifier

The value returned in EBX provides the package type and a 16-bit processor name string identifier.


Bits 

Field Name 

Description 

31:28 

PkgType 

Package type. If (Family[7:0] >= 10h), this field is valid. If (Family[7:0] < 10h), this
field is reserved.

27:16 

— 

Reserved. 

15:0 

BrandId

Brand ID. This field, in conjunction with CPUID Fn0000_0001_EBX[8BitBrandId], is
used by system firmware to generate the processor name string. See your 
processor revision guide for how to program the processor name string. 

For processor families 10h and greater, PkgType is described in the BIOS and Kernel Developer's
Guide for the product.

CPUID Fn8000_0001_ECX Feature Identifiers

This function contains the following miscellaneous feature identifiers: 


Bits 

Field Name 

Description 

31:28 

— 

Reserved. 

27 

PerfTsc 

Performance time-stamp counter. Indicates support for MSRC001_0280 
[Performance Time Stamp Counter]. 

26 

DataBreakpointExtension

Data access breakpoint extension. Indicates support for MSRC001_1027 and
MSRC001_101[B:9].

25 

— 

Reserved.

24 

PerfCtrExtNB 

NB performance counter extensions support. Indicates support for 
MSRC001_024[6,4,2,0] and MSRC001_024[7,5,3,1].

23 

PerfCtrExtCore 

Processor performance counter extensions support. Indicates support for 
MSRC001_020[A,8,6,4,2,0] and MSRC001_020[B,9,7,5,3,1].

22 

TopologyExtensions

Topology extensions support. Indicates support for CPUID
Fn8000_001D_EAX_x[N:0]-CPUID Fn8000_001E_EDX.

21 

TBM 

Trailing bit manipulation instruction support. 

20 

— 

Reserved. 

19 

— 

Reserved. 

18 

— 

Reserved. 

17 

— 

Reserved. 

16 

FMA4 

Four-operand FMA instruction support. 

15 

LWP 

Lightweight profiling support. See “Lightweight Profiling” in APM2 and reference 
pages for individual LWP instructions in APM3. 

14 

— 

Reserved. 

13 

WDT 

Watchdog timer support. See APM2 and APM3. Indicates support for 
MSRC001_0074. 

12 

SKINIT 

SKINIT and STGI are supported. Indicates support for SKINIT and STGI, 
independent of the value of MSRC000_0080[SVME]. See APM2 and APM3. 

11 

XOP 

Extended operation support. 

10 

IBS 

Instruction based sampling. See “Instruction Based Sampling” in APM2. 

9 

OSVW 

OS visible workaround. Indicates OS-visible workaround support. See “OS Visible 
Work-around (OSVW) Information” in APM2. 

8 

3DNowPrefetch 

PREFETCH and PREFETCHW instruction support. See “PREFETCH” and 
“PREFETCHW” in APM3. 

7 

MisAlignSse 

Misaligned SSE mode. See “Misaligned Access Support Added for SSE 
Instructions” in APM1. 

6 

SSE4A 

EXTRQ, INSERTQ, MOVNTSS, and MOVNTSD instruction support. See 
“EXTRQ”, “INSERTQ”, “MOVNTSS”, and “MOVNTSD” in APM4. 

5 

ABM 

Advanced bit manipulation. LZCNT instruction support. See “LZCNT” in APM3. 

4 

AltMovCr8 

LOCK MOV CR0 means MOV CR8. See “MOV(CRn)” in APM3.

3 

ExtApicSpace 

Extended APIC space. This bit indicates the presence of extended APIC register 
space starting at offset 400h from the “APIC Base Address Register,” as specified 
in the BKDG. 

2 

SVM 

Secure virtual machine. See “Secure Virtual Machine” in APM2. 

1 

CmpLegacy 

Core multi-processing legacy mode. See “Legacy Method” on page 639. 

0 

LahfSahf 

LAHF and SAHF instruction support in 64-bit mode. See “LAHF” and “SAHF” in 
APM3. 


CPUID Fn8000_0001_EDX Feature Identifiers


This function contains the following miscellaneous feature identifiers: 


Bits 

Field Name 

Description 

31 

3DNow 

3DNow!™ instructions. See Appendix D “Instruction Subsets and CPUID Feature 
Sets” in APM3. 

30 

3DNowExt 

AMD extensions to 3DNow! instructions. See Appendix D “Instruction Subsets and 
CPUID Feature Sets” in APM3. 

29 

LM 

Long mode. See “Processor Initialization and Long-Mode Activation” in APM2. 

28 

— 

Reserved. 

27 

RDTSCP 

RDTSCP instruction. See “RDTSCP” in APM3. 

26 

Page1GB

1-GB large page support. See “1-GB Paging Support” in APM2. 

25 

FFXSR 

FXSAVE and FXRSTOR instruction optimizations. See “FXSAVE” and “FXRSTOR” 
in APM5. 

24 

FXSR 

FXSAVE and FXRSTOR instructions. Same as CPUID Fn0000_0001_EDX[FXSR].

23 

MMX 

MMX™ instructions. Same as CPUID Fn0000_0001_EDX[MMX].

22 

MmxExt 

AMD extensions to MMX instructions. See Appendix D “Instruction Subsets and 
CPUID Feature Sets” in APM3 and “128-Bit Media and Scientific Programming” in 
APM1. 

21 

— 

Reserved. 

20 

NX 

No-execute page protection. See “Page Translation and Protection” in APM2. 

19:18 

— 

Reserved. 

17 

PSE36 

Page-size extensions. Same as CPUID Fn0000_0001_EDX[PSE36]. 

16 

PAT 

Page attribute table. Same as CPUID Fn0000_0001_EDX[PAT]. 

15 

CMOV 

Conditional move instructions. Same as CPUID Fn0000_0001_EDX[CMOV].

14 

MCA 

Machine check architecture. Same as CPUID Fn0000_0001_EDX[MCA]. 

13 

PGE 

Page global extension. Same as CPUID Fn0000_0001_EDX[PGE]. 

12 

MTRR 

Memory-type range registers. Same as CPUID Fn0000_0001_EDX[MTRR]. 

11 

SysCallSysRet 

SYSCALL and SYSRET instructions. See “SYSCALL” and “SYSRET” in APM3. 

10 

— 

Reserved. 

9 

APIC 

Advanced programmable interrupt controller. Same as CPUID 
Fn0000_0001_EDX[APIC].

8 

CMPXCHG8B 

CMPXCHG8B instruction. Same as CPUID Fn0000_0001_EDX[CMPXCHG8B].

7 

MCE 

Machine check exception. Same as CPUID Fn0000_0001_EDX[MCE]. 

6 

PAE 

Physical-address extensions. Same as CPUID Fn0000_0001_EDX[PAE]. 

5 

MSR 

AMD model-specific registers. Same as CPUID Fn0000_0001_EDX[MSR]. 

4 

TSC 

Time stamp counter. Same as CPUID Fn0000_0001_EDX[TSC]. 

3 

PSE 

Page-size extensions. Same as CPUID Fn0000_0001_EDX[PSE]. 

2 

DE 

Debugging extensions. Same as CPUID Fn0000_0001_EDX[DE]. 

1 

VME 

Virtual-mode enhancements. Same as CPUID Fn0000_0001_EDX[VME]. 

0 

FPU 

x87 floating-point unit on-chip. Same as CPUID Fn0000_0001_EDX[FPU]. 


E.4.3 Functions 8000_0002h-8000_0004h—Extended Processor Name String 

CPUID Fn8000_000[4:2]_E[D,C,B,A]X Processor Name String Identifier

The three extended functions from Fn8000_0002 to Fn8000_0004 are programmed to return a
null-terminated ASCII string up to 48 characters in length corresponding to the processor name.


Bits 

Field Name 

Description 

31:0 

ProcName 

Four characters of the extended processor name string. 


The 48-character maximum includes the terminating null character. The 48-character string is ordered
first to last (left to right) as follows:

Fn8000_0002[EAX[7:0],..., EAX[31:24], EBX[7:0],..., EBX[31:24], ECX[7:0],..., ECX[31:24],
EDX[7:0],..., EDX[31:24]],

Fn8000_0003[EAX[7:0],..., EAX[31:24], EBX[7:0],..., EBX[31:24], ECX[7:0],..., ECX[31:24],
EDX[7:0],..., EDX[31:24]],

Fn8000_0004[EAX[7:0],..., EAX[31:24], EBX[7:0],..., EBX[31:24], ECX[7:0],..., ECX[31:24],
EDX[7:0],..., EDX[31:24]].

The extended processor name string is programmed by system firmware. See your processor revision 
guide for information about how to display the extended processor name string. 
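The byte ordering described above can be expressed as a short loop. The C sketch below is illustrative only: the function name is hypothetical, and the caller is assumed to have stored the twelve register values in the order Fn8000_0002 EAX, EBX, ECX, EDX, then Fn8000_0003, then Fn8000_0004.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble the processor name string from 12 register values.
 * regs[0..3]  = Fn8000_0002 EAX, EBX, ECX, EDX
 * regs[4..7]  = Fn8000_0003 EAX, EBX, ECX, EDX
 * regs[8..11] = Fn8000_0004 EAX, EBX, ECX, EDX
 * `out` must hold at least 49 bytes. */
static void cpuid_name_string(const uint32_t regs[12], char out[49])
{
    for (size_t i = 0; i < 12; i++)
        for (size_t b = 0; b < 4; b++)   /* bits 7:0 first, 31:24 last */
            out[4 * i + b] = (char)((regs[i] >> (8 * b)) & 0xFFu);
    out[48] = '\0'; /* defensive terminator, an addition of this sketch */
}
```

The extra terminator at index 48 is not required by the text above, which already counts the null character within the 48 bytes. It only guards against firmware that fills all 48 bytes.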

E.4.4 Function 8000_0005h—L1 Cache and TLB Information

This function provides first-level cache and TLB characteristics for the processor that executes the
instruction.

CPUID Fn8000_0005_EAX L1 TLB 2M/4M Information

The value returned in EAX provides information about the L1 TLB for 2-MB and 4-MB pages.


Bits 

Field Name 

Description 

31:24 

L1DTlb2and4MAssoc

Data TLB associativity for 2-MB and 4-MB pages. Encoding is per Table E-3 
below. 

23:16 

L1DTlb2and4MSize

Data TLB number of entries for 2-MB and 4-MB pages. The value returned is for 
the number of entries available for the 2-MB page size; 4-MB pages require two 
2-MB entries, so the number of entries available for the 4-MB page size is one- 
half the returned value. 

15:8 

L1ITlb2and4MAssoc

Instruction TLB associativity for 2-MB and 4-MB pages. Encoding is per
Table E-3 below.

7:0 

L1ITlb2and4MSize

Instruction TLB number of entries for 2-MB and 4-MB pages. The value returned 
is for the number of entries available for the 2-MB page size; 4-MB pages 
require two 2-MB entries, so the number of entries available for the 4-MB page 
size is one-half the returned value. 


The associativity fields (L1DTlb2and4MAssoc and L1ITlb2and4MAssoc) are encoded as follows:

Table E-3. L1 Cache and TLB Associativity Field Encodings


Associativity 

[7:0] 

Definition 

00h

Reserved

01h

1 way (direct mapped)

02h-FEh

n-way associative (field encodes n)

FFh 

Fully associative 
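The Table E-3 encoding can be decoded with a few comparisons. The following C sketch is a minimal example. The sentinel macros and the function name are hypothetical choices for this illustration and are not defined by the architecture.

```c
#include <stdint.h>

/* Hypothetical sentinel return values for the two special encodings. */
#define L1_ASSOC_RESERVED 0    /* encoding 00h */
#define L1_ASSOC_FULL     (-1) /* encoding FFh */

/* Returns the number of ways for encodings 01h..FEh, or one of the
 * sentinel values above for 00h and FFh. */
static int l1_assoc_ways(uint8_t field)
{
    if (field == 0x00u)
        return L1_ASSOC_RESERVED;
    if (field == 0xFFu)
        return L1_ASSOC_FULL;
    return field; /* the field encodes n directly */
}
```

A negative sentinel for the fully associative case keeps it distinct from every valid way count.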


CPUID Fn8000_0005_EBX L1 TLB 4K Information


The value returned in EBX provides information about the L1 TLB for 4-KB pages.


Bits 

Field Name 

Description 

31:24 

L1DTlb4KAssoc

Data TLB associativity for 4-KB pages. Encoding is per Table E-3 above.

23:16

L1DTlb4KSize

Data TLB number of entries for 4-KB pages.

15:8

L1ITlb4KAssoc

Instruction TLB associativity for 4-KB pages. Encoding is per Table E-3 above.

7:0

L1ITlb4KSize

Instruction TLB number of entries for 4 KB pages. 


The associativity fields (L1DTlb4KAssoc and L1ITlb4KAssoc) are encoded as specified in Table E-3
on page 623. 


CPUID Fn8000_0005_ECX L1 Data Cache Information


The value returned in ECX provides information about the first level data cache. 

Bits

Field Name 

Description 

31:24 

L1DcSize

L1 data cache size in KB.

23:16

L1DcAssoc

L1 data cache associativity. Encoding is per Table E-3.

15:8

L1DcLinesPerTag

L1 data cache lines per tag.

7:0

L1DcLineSize

L1 data cache line size in bytes.


The associativity field (L1DcAssoc) is encoded as specified in Table E-3 on page 623.


CPUID Fn8000_0005_EDX L1 Instruction Cache Information


The value returned in EDX provides information about the first level instruction cache. 


Bits 

Field Name 

Description 

31:24 

L1IcSize

L1 instruction cache size in KB.

23:16

L1IcAssoc

L1 instruction cache associativity. Encoding is per Table E-3.

15:8

L1IcLinesPerTag

L1 instruction cache lines per tag.

7:0

L1IcLineSize

L1 instruction cache line size in bytes.


The associativity field (L1IcAssoc) is encoded as specified in Table E-3 on page 623.
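The ECX and EDX values of Fn8000_0005 share one layout of four 8-bit fields, so a single decoder can serve both the data cache and the instruction cache. The C sketch below assumes the caller passes the raw register value. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical decoded view of Fn8000_0005 ECX (data) or EDX (instruction). */
struct l1_cache_info {
    uint8_t size_kb;       /* bits 31:24, cache size in KB */
    uint8_t assoc;         /* bits 23:16, Table E-3 encoding */
    uint8_t lines_per_tag; /* bits 15:8 */
    uint8_t line_size;     /* bits 7:0, line size in bytes */
};

static struct l1_cache_info l1_cache_decode(uint32_t reg)
{
    struct l1_cache_info c;
    c.size_kb       = (uint8_t)(reg >> 24);
    c.assoc         = (uint8_t)(reg >> 16);
    c.lines_per_tag = (uint8_t)(reg >> 8);
    c.line_size     = (uint8_t)reg;
    return c;
}
```

The associativity field is left in its encoded form so that the reserved and fully associative encodings remain distinguishable.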

E.4.5 Function 8000_0006h—L2 Cache and TLB and L3 Cache Information 

This function provides the second level cache and TLB characteristics for the processor that executes 
the instruction. The EDX register returns the processor’s third level cache characteristics that are 
shared by all cores of the processor. 

CPUID Fn8000_0006_EAX L2 TLB 2M/4M Information 

The value returned in EAX provides information about the L2 TLB for 2-MB and 4-MB pages. 


Bits 

Field Name 

Description 

31:28 

L2DTlb2and4MAssoc

L2 data TLB associativity for 2-MB and 4-MB pages. Encoding is per
Table E-4 below.

27:16 

L2DTlb2and4MSize

L2 data TLB number of entries for 2-MB and 4-MB pages. The value returned 
is for the number of entries available for the 2 MB page size; 4 MB pages 
require two 2 MB entries, so the number of entries available for the 4 MB page 
size is one-half the returned value. 

15:12 

L2ITlb2and4MAssoc

L2 instruction TLB associativity for 2-MB and 4-MB pages. Encoding is per 
Table E-4 below. 

11:0 

L2ITlb2and4MSize

L2 instruction TLB number of entries for 2-MB and 4-MB pages. The value 
returned is for the number of entries available for the 2 MB page size; 4 MB 
pages require two 2 MB entries, so the number of entries available for the 4 MB 
page size is one-half the returned value. 

The associativity fields (L2DTlb2and4MAssoc and L2ITlb2and4MAssoc) are encoded as follows:

Table E-4. L2/L3 Cache and TLB Associativity Field Encoding 


Associativity 

[3:0] 

Definition 

0h

L2/L3 cache or TLB is disabled.

1h

Direct mapped. 

2h 

2-way associative. 

3h 

3-way associative. 

4h 

4-way associative. 

5h 

6-way associative. 

6h 

8-way associative. 

8h 

16-way associative. 

9h 

Value for all fields should be determined from Fn8000_001D.

Ah 

32-way associative. 

Bh 

48-way associative. 

Ch 

64-way associative. 

Dh 

96-way associative. 

Eh 

128-way associative. 

Fh 

Fully associative. 

All other encodings are reserved. 
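Because the Table E-4 encoding is not linear, a lookup table is a natural way to decode it. The C sketch below is illustrative: the sentinel macros and function name are hypothetical, and encoding 7h is treated as reserved because it does not appear in the table.

```c
/* Hypothetical sentinel values for the non-numeric encodings. */
#define L2L3_ASSOC_DISABLED     0    /* 0h */
#define L2L3_ASSOC_FULL         (-1) /* Fh */
#define L2L3_ASSOC_SEE_FN8_001D (-2) /* 9h: consult Fn8000_001D instead */
#define L2L3_ASSOC_RESERVED     (-3) /* any encoding not listed (7h) */

/* Map a 4-bit Table E-4 associativity field to a way count or sentinel. */
static int l2l3_assoc_ways(unsigned field)
{
    static const int ways[16] = {
        L2L3_ASSOC_DISABLED, 1, 2, 3, 4, 6, 8, L2L3_ASSOC_RESERVED,
        16, L2L3_ASSOC_SEE_FN8_001D, 32, 48, 64, 96, 128, L2L3_ASSOC_FULL
    };
    return ways[field & 0xFu];
}
```

Masking the argument to four bits means the function can be handed a shifted register value without a separate mask step.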


CPUID Fn8000_0006_EBX L2 TLB 4K Information 

The value returned in EBX provides information about the L2 TLB for 4-KB pages. 


Bits 

Field Name 

Description 

31:28 

L2DTlb4KAssoc

L2 data TLB associativity for 4-KB pages. Encoding is per Table E-4 above.

27:16

L2DTlb4KSize

L2 data TLB number of entries for 4-KB pages.

15:12

L2ITlb4KAssoc

L2 instruction TLB associativity for 4-KB pages. Encoding is per Table E-4 above.

11:0

L2ITlb4KSize

L2 instruction TLB number of entries for 4-KB pages. 


The associativity fields (L2DTlb4KAssoc and L2ITlb4KAssoc) are encoded per Table E-4 above. 


CPUID Fn8000_0006_ECX L2 Cache Information


The value returned in ECX provides information about the L2 cache. 

Bits

Field Name 

Description 

31:16 

L2Size 

L2 cache size in KB. 

15:12 

L2Assoc 

L2 cache associativity. Encoding is per Table E-4 on page 625. 

11:8 

L2LinesPerTag 

L2 cache lines per tag. 

7:0 

L2LineSize 

L2 cache line size in bytes. 


The associativity field (L2Assoc) is encoded per Table E-4 on page 625. 
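The L2 fields have different widths from their L1 counterparts, which is easy to get wrong when reusing L1 decoding code. The C sketch below extracts the four ECX fields with the widths given in the table. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical decoded view of CPUID Fn8000_0006_ECX. */
struct l2_cache_info {
    uint32_t size_kb;       /* L2Size, bits 31:16, in KB */
    uint32_t assoc_code;    /* L2Assoc, bits 15:12, Table E-4 encoding */
    uint32_t lines_per_tag; /* L2LinesPerTag, bits 11:8 */
    uint32_t line_size;     /* L2LineSize, bits 7:0, in bytes */
};

static struct l2_cache_info l2_cache_decode(uint32_t ecx)
{
    struct l2_cache_info c;
    c.size_kb       = (ecx >> 16) & 0xFFFFu;
    c.assoc_code    = (ecx >> 12) & 0xFu;
    c.lines_per_tag = (ecx >> 8) & 0xFu;
    c.line_size     = ecx & 0xFFu;
    return c;
}
```

The associativity is returned as the raw 4-bit code, to be interpreted with Table E-4.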


CPUID Fn8000_0006_EDX L3 Cache Information 

The value returned in EDX provides the third level cache characteristics shared by all cores of a 
physical processor. 


Bits 

Field Name 

Description 

31:18 

L3Size 

Specifies the L3 cache size range: 

(L3Size[31:18] * 512KB) <= L3 cache size < ((L3Size[31:18]+1) * 512KB).

17:16 

— 

Reserved. 

15:12 

L3Assoc 

L3 cache associativity. Encoded per Table E-4 on page 625. 

11:8 

L3LinesPerTag 

L3 cache lines per tag. 

7:0 

L3LineSize 

L3 cache line size in bytes. 


The associativity field (L3Assoc) is encoded per Table E-4 on page 625. 
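Unlike L2Size, the L3Size field gives a range in 512-KB units rather than an exact size. The C sketch below computes both bounds in KB. The names are hypothetical, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption of this example.

```c
#include <stdint.h>

/* Hypothetical L3 size range in KB: min_kb <= size < max_kb. */
struct l3_size_range {
    uint32_t min_kb;
    uint32_t max_kb;
};

static struct l3_size_range l3_size_range_kb(uint32_t edx)
{
    uint32_t units = (edx >> 18) & 0x3FFFu; /* L3Size, bits 31:18 */
    struct l3_size_range r = { units * 512u, (units + 1u) * 512u };
    return r;
}
```

Even the largest field value (3FFFh) yields bounds that fit in 32 bits when expressed in KB, so no wider type is needed.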


E.4.6 Function 8000_0007h—Processor Power Management and RAS Capabilities 

This function provides information about the power management, power reporting, and RAS
capabilities of the processor that executes the instruction. There may be other processor-specific
features and reporting capabilities not covered here. Refer to the BIOS and Kernel Developer's Guide
for your specific product to obtain more information.

CPUID Fn8000_0007_EAX Reserved


Bits 

Field Name 

Description 

31:0 

— 

Reserved. 


CPUID Fn8000_0007_EBX RAS Capabilities 

The value returned in EBX provides information about RAS features that allow system software to 
detect specific hardware errors. 

Bits

Field Name 

Description 

31:3 

— 

Reserved. 

2 

HWA 

Hardware assert supported. Indicates support for MSRC001_10[DF:C0]. 

1 

SUCCOR 

Software uncorrectable error containment and recovery capability. 

The processor supports software containment of uncorrectable errors through 
context synchronizing data poisoning and deferred error interrupts; see APM2, 
Chapter 9, “Determining Machine-Check Architecture Support.” 

0 

McaOverflowRecov 

MCA overflow recovery support. If set, indicates that MCA overflow conditions 
(MCi_STATUS[Overflow]=1) are not fatal; software may safely ignore such 
conditions. If clear, MCA overflow conditions require software to shut down the 
system. See APM2, Chapter 9, “Handling Machine Check Exceptions.”


CPUID Fn8000_0007_ECX Processor Power Monitoring Interface

The value returned in ECX provides information about the implementation of the processor power 
monitoring interface. 


Bits 

Field Name 

Description 

31:0 

CpuPwrSampleTimeRatio 

Specifies the ratio of the compute unit power accumulator sample 
period to the TSC counter period. 


CPUID Fn8000_0007_EDX Advanced Power Management Features

The value returned in EDX provides information about the advanced power management and power
reporting features available. Refer to the BIOS and Kernel Developer's Guide for your specific product
for a detailed description of each power management feature.


Bits 

Field Name 

Description 

31:13 

— 

Reserved. 

12 

ProcPowerReporting 

Core power reporting interface supported. 

11 

ProcFeedbackInterface

Processor feedback interface. Value: 1. 1 = Indicates support for processor
feedback interface. Note: This feature is deprecated.

10

EffFreqRO

Read-only effective frequency interface. 1 = Indicates presence of
MSRC000_00E7 [Read-Only Max Performance Frequency Clock Count
(MPerfReadOnly)] and MSRC000_00E8 [Read-Only Actual Performance
Frequency Clock Count (APerfReadOnly)].

9 

CPB 

Core performance boost. 

8 

TscInvariant

TSC invariant. The TSC rate is ensured to be invariant across all P-States,
C-States, and stop grant transitions (such as STPCLK Throttling); therefore the
TSC is suitable for use as a source of time. 0 = No such guarantee is made
and software should avoid attempting to use the TSC as a source of time.

7 

HwPstate 

Hardware P-state control. MSRC001_0061 [P-state Current Limit], 
MSRC001_0062 [P-state Control] and MSRC001_0063 [P-state Status] exist. 

6

100MHzSteps

100 MHz multiplier Control. 

5 

— 

Reserved. 

4 

TM 

Hardware thermal control (HTC). 

3 

TTP 

THERMTRIP. 

2 

VID 

Voltage ID control. Function replaced by HwPstate. 

1 

FID 

Frequency ID control. Function replaced by HwPstate. 

0 

TS 

Temperature sensor. 


E.4.7 Function 8000_0008h—Processor Capacity Parameters and Extended Feature 
Identification 

This function provides the size or capacity of various architectural parameters that vary by 
implementation, as well as an extension to the Fn8000_0001 feature identifiers. 

CPUID Fn8000_0008_EAX Long Mode Size Identifiers 

The value returned in EAX provides information about the maximum host and guest physical and 
linear address width (in bits) supported by the processor. 


Bits 

Field Name 

Description 

31:24 

— 

Reserved. 

23:16 

GuestPhysAddrSize 

Maximum guest physical byte address size in bits. This number applies only to 
guests using nested paging. When this field is zero, refer to the PhysAddrSize 
field for the maximum guest physical address size. See “Secure Virtual Machine” 
in APM2. 

15:8 

LinAddrSize 

Maximum linear byte address size in bits. 

7:0 

PhysAddrSize 

Maximum physical byte address size in bits. When GuestPhysAddrSize is zero, 
this field also indicates the maximum guest physical address size. 


The address width reported is the maximum supported in any mode. For long-mode-capable processors,
the size reported is independent of whether long mode is enabled. See “Processor Initialization
and Long-Mode Activation” in APM2. 
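The fallback rule for GuestPhysAddrSize is easiest to see in code. The C sketch below decodes the three EAX fields and applies that rule. The structure and function names are hypothetical, and the caller is assumed to pass the EAX value returned by CPUID Fn8000_0008.

```c
#include <stdint.h>

/* Hypothetical decoded view of CPUID Fn8000_0008_EAX (all widths in bits). */
struct addr_sizes {
    unsigned phys_bits;       /* PhysAddrSize, bits 7:0 */
    unsigned lin_bits;        /* LinAddrSize, bits 15:8 */
    unsigned guest_phys_bits; /* GuestPhysAddrSize, bits 23:16, after fallback */
};

static struct addr_sizes addr_sizes_decode(uint32_t eax)
{
    struct addr_sizes s;
    s.phys_bits       = eax & 0xFFu;
    s.lin_bits        = (eax >> 8) & 0xFFu;
    s.guest_phys_bits = (eax >> 16) & 0xFFu;
    if (s.guest_phys_bits == 0)
        s.guest_phys_bits = s.phys_bits; /* zero: PhysAddrSize applies */
    return s;
}
```

Applying the fallback inside the decoder means callers never see a guest width of zero.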


CPUID Fn8000_0008_EBX Extended Feature Identifiers 

The value returned in EBX is an extension to the Fn8000_0001 feature flags and indicates the presence 
of various ISA extensions. 


Bit 

Field Name 

Description 

0 

CLZERO 

CLZERO instruction supported 

1 

InstRetCntMsr 

Instruction Retired Counter MSR available 

2 

RstrFpErrPtrs 

FP Error Pointers Restored by XRSTOR 


CPUID Fn8000_0008_ECX Size Identifiers

The value returned in ECX provides information about the number of cores supported by the 
processor, the width of the APIC ID, and the width of the performance time-stamp counter. 


Bits 

Field Name 

Description 

31:18

—

Reserved.

17:16

PerfTscSize

Performance time-stamp counter size. Indicates the size of
MSRC001_0280[PTSC].

Bits Description

00b 40 bits

01b 48 bits

10b 56 bits

11b 64 bits

15:12

ApicIdCoreIdSize

APIC ID size. The number of bits in the initial APIC20[ApicId] value that indicate
core ID within a processor. A zero value indicates that legacy methods must be
used to derive the maximum number of cores. The size of this field determines the
maximum number of cores (MNC) that the processor could theoretically support,
not the number of cores actually implemented or enabled on the
processor, as indicated by CPUID Fn8000_0008_ECX[NC].

if (ApicIdCoreIdSize[3:0] == 0) {
    // Used by legacy dual-core/single-core processors
    MNC = CPUID Fn8000_0008_ECX[NC] + 1;
} else {
    // Use ApicIdCoreIdSize[3:0] field
    MNC = 2 ^ ApicIdCoreIdSize[3:0]; // 2 raised to the power ApicIdCoreIdSize
}

11:8

—

Reserved.

7:0

NC

Number of physical cores - 1. The number of cores in the processor is NC+1 (e.g.,
if NC = 0, then there is one core). See “Legacy Method” on page 639.
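The maximum-core-count rule in the ApicIdCoreIdSize description translates directly into C, with the power of two written as a shift. The sketch below also decodes PerfTscSize, using the fact that the four encodings step by 8 bits from 40. The function names are hypothetical, and the caller is assumed to pass the ECX value returned by CPUID Fn8000_0008.

```c
#include <stdint.h>

/* Maximum number of cores (MNC) the processor could support,
 * derived from CPUID Fn8000_0008_ECX. */
static uint32_t max_core_count(uint32_t ecx)
{
    uint32_t core_id_size = (ecx >> 12) & 0xFu; /* ApicIdCoreIdSize, 15:12 */
    uint32_t nc = ecx & 0xFFu;                  /* NC, bits 7:0 */

    if (core_id_size == 0)
        return nc + 1u;        /* legacy method: number of cores is NC + 1 */
    return 1u << core_id_size; /* 2 to the power ApicIdCoreIdSize */
}

/* Width in bits of the performance time-stamp counter (PerfTscSize, 17:16):
 * 00b = 40, 01b = 48, 10b = 56, 11b = 64. */
static unsigned perf_tsc_bits(uint32_t ecx)
{
    return 40u + 8u * ((ecx >> 16) & 0x3u);
}
```

Note that in C the `^` operator is exclusive OR, so the exponentiation in the manual's pseudocode has to be written as a shift.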


CPUID Fn8000_0008_EDX Reserved

The value returned in EDX for this function is undefined and is reserved. 


E.4.8 Function 8000_0009h—Reserved 
CPUID Fn8000_0009 Reserved

This function is reserved. 


E.4.9 Function 8000_000Ah—SVM Features 

This function provides information about the SVM features that the processor supports. If SVM is
not supported (CPUID Fn8000_0001_ECX[SVM] = 0), this function is reserved. 

CPUID Fn8000_000A_EAX SVM Revision and Feature Identification

The value returned in EAX provides the SVM revision number.


Bits 

Field Name 

Description 

31:8 

— 

Reserved. 

7:0 

SvmRev 

SVM revision number. 


CPUID Fn8000_000A_EBX SVM Revision and Feature Identification

The value returned in EBX provides the number of address space identifiers (ASIDs) that the 
processor supports. 


Bits 

Field Name 

Description 

31:0 

NASID

Number of available address space identifiers (ASID). 


CPUID Fn8000_000A_ECX Reserved

The value returned in ECX for this function is undefined and is reserved.

CPUID Fn8000_000A_EDX SVM Feature Identification

The value returned in EDX provides Secure Virtual Machine architecture feature information. All 
cross references in the table below are to sections within the Secure Virtual Machine chapter of APM2. 


Bits 

Field Name 

Description 

31:17 

— 

Reserved. 

16 

VGIF 

Virtualize the Global Interrupt Flag. See “Nested Virtualization.”

15

VMSAVEvirt

VMSAVE and VMLOAD virtualization. See “Nested Virtualization.”

14

—

Reserved.

13 

AVIC 

Support for the AMD advanced virtual interrupt controller. See “Advanced
Virtual Interrupt Controller.”

12 

PauseFilterThreshold 

PAUSE filter threshold. Indicates support for the PAUSE filter cycle count 
threshold. See “Pause Intercept Filtering” in Volume 2.

11 

— 

Reserved. 

10 

PauseFilter 

Pause intercept filter. Indicates support for the pause intercept filter. See “Pause 
Intercept Filtering.” 

9:8 

— 

Reserved. 

7 

DecodeAssists 

Decode assists. Indicates support for the decode assists. See “DecodeAssists.” 

6 

FlushByAsid 

Flush by ASID. Indicates that TLB flush events, including CR3 writes and 
CR4.PGE toggles, flush only the current ASID's TLB entries. Also indicates 
support for the extended VMCB TLB Control. See “TLB Control.” 

5 

VmcbClean 

VMCB clean bits. Indicates support for VMCB clean bits. See “VMCB Clean 
Bits.” 

4 

TscRateMsr 

MSR based TSC rate control. Indicates support for MSR TSC ratio 
MSRC000_0104. See “TSC Ratio MSR (C000_0104h).” 

3 

NRIPS 

NRIP save. Indicates support for NRIP save on #VMEXIT. See “State Saved on 
Exit.” 

2 

SVML 

SVM lock. Indicates support for SVM-Lock. See “Enabling SVM.” 

1 

LbrVirt 

LBR virtualization. Indicates support for LBR Virtualization. See “Enabling LBR 
Virtualization.” 

0 

NP 

Nested paging. Indicates support for nested paging. See “Nested Paging.” 


E.4.10 Functions 8000_000Bh-8000_0018h—Reserved 
CPUID Fn8000_00[18:0B] Reserved 

These functions are reserved. 

E.4.11 Function 8000_0019h—TLB Characteristics for 1GB pages 

This function provides information about the TLB for 1 GB pages for the processor that executes the
instruction.

CPUID Fn8000_0019_EAX L1 TLB 1G Information

The value returned in EAX provides information about the L1 TLB for 1 GB pages.


Bits 

Field Name 

Description 

31:28 

LIDTIbIGAssoc 

LI data TLB associativity for 1 GB pages. See Table E-4 on page 625. 

27:16 

LIDTIbIGSize 

LI data TLB number of entries for 1 GB pages. 

15:12 

LI ITIbIGAssoc 

LI instruction TLB associativity for 1 GB pages. See Table E-4 on page 625. 

11:0 

LI ITIbIGSize 

LI instruction TLB number of entries for 1 GB pages. 


CPUID Fn8000_0019_EBX L2 TLB 1G Information


The value returned in EBX provides information about the L2 TLB for 1 GB pages. 

Bits

Field Name

Description

31:28

L2DTlb1GAssoc

L2 data TLB associativity for 1 GB pages. See Table E-4 on page 625.

27:16

L2DTlb1GSize

L2 data TLB number of entries for 1 GB pages.

15:12

L2ITlb1GAssoc

L2 instruction TLB associativity for 1 GB pages. See Table E-4 on page 625.

11:0

L2ITlb1GSize

L2 instruction TLB number of entries for 1 GB pages. 


CPUID Fn8000_0019_E[D,C]X Reserved 

The values returned in ECX and EDX for this function are undefined and reserved for future use. 
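
The field layout above can be illustrated with a short decoding sketch. It assumes the caller has already obtained the raw EAX or EBX value from CPUID Fn8000_0019; the helper names `bits` and `decode_tlb_1g` are hypothetical, and the sample value in the usage note is made up for illustration.

```python
def bits(value, hi, lo):
    """Return bits hi:lo (inclusive) of a 32-bit register value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_tlb_1g(reg):
    """Split an EAX (L1) or EBX (L2) value from Fn8000_0019 into its four fields."""
    return {
        # Associativity fields are encoded per Table E-4, not raw way counts.
        "dtlb_assoc": bits(reg, 31, 28),
        "dtlb_entries": bits(reg, 27, 16),
        "itlb_assoc": bits(reg, 15, 12),
        "itlb_entries": bits(reg, 11, 0),
    }
```

For example, `decode_tlb_1g(0xF040F040)` yields 64 entries for both the data and instruction TLB, with an associativity encoding of Fh in both fields.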


E.4.12 Function 8000_001Ah—Instruction Optimizations 


CPUID Fn8000_001A_EAX Performance Optimization Identifiers 

This function returns performance related information. For more details on how to use these bits to optimize 
software, see the Software Optimization Guide applicable to your product. 


Bits 

Field Name 

Description 

31:3 

— 

Reserved. 

2 

FP256 

256-bit AVX instructions are executed with full-width internal operations and 
pipelines rather than decomposing them into internal 128-bit suboperations. This 
may impact how software performs instruction selection and scheduling. 

1 

MOVU 

MOVU SSE instructions are more efficient and should be preferred to SSE
MOVL/MOVH. MOVUPS is more efficient than MOVLPS/MOVHPS. MOVUPD is 
more efficient than MOVLPD/MOVHPD. 

0 

FP128 

128-bit SSE (multimedia) instructions are executed with full-width internal 
operations and pipelines rather than decomposing them into internal 64-bit 
suboperations. This may impact how software performs instruction selection and 
scheduling. 


CPUID Fn8000_001A_E[D,C,B]X Reserved 

The values returned in EBX, ECX, and EDX are undefined for this function and are reserved. 
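
As a minimal sketch of how software might consume these three bits, the following decodes a raw EAX value from CPUID Fn8000_001A. The function name `decode_perf_opt` is hypothetical, and how the flags feed into instruction selection is left to the applicable Software Optimization Guide.

```python
def decode_perf_opt(eax):
    """Extract the optimization identifier bits from Fn8000_001A EAX."""
    return {
        "FP128": bool(eax & (1 << 0)),  # full-width 128-bit SSE execution
        "MOVU": bool(eax & (1 << 1)),   # MOVU* preferred over MOVL*/MOVH*
        "FP256": bool(eax & (1 << 2)),  # full-width 256-bit AVX execution
    }
```

A caller could, for instance, prefer MOVUPS over a MOVLPS/MOVHPS pair whenever `decode_perf_opt(eax)["MOVU"]` is true.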

E.4.13 Function 8000_001Bh—Instruction-Based Sampling Capabilities

If instruction-based sampling (IBS) is supported (CPUID Fn8000_0001_ECX[IBS] = 1), this CPUID 
function can be used to obtain IBS feature information. If IBS is not supported (CPUID 
Fn8000_0001_ECX[IBS] = 0), this function number is reserved. For more information on using IBS, 
see “Instruction-Based Sampling” in APM2. 

CPUID Fn8000_001B_EAX Instruction-Based Sampling Feature Indicators

The value returned in EAX provides the following information about the specific features of IBS that 
the processor supports: 


Bits 

Field Name 

Description 

31:9

—

Reserved.

8 

OpBrnFuse 

Fused branch micro-op indication supported. 

7 

RipInvalidChk 

Invalid RIP indication supported. 

6 

OpCntExt 

IbsOpCurCnt and IbsOpMaxCnt extend by 7 bits. 

5 

BrnTrgt 

Branch target address reporting supported. 

4 

OpCnt 

Op counting mode supported. 

3 

RdWrOpCnt 

Read write of op counter supported. 

2 

OpSam 

IBS execution sampling supported. 

1 

FetchSam 

IBS fetch sampling supported. 

0 

IBSFFV 

IBS feature flags valid. 


CPUID Fn8000_001B_E[D,C,B]X Reserved

The values returned in EBX, ECX, and EDX are undefined and are reserved. 


E.4.14 Function 8000_001Ch—Lightweight Profiling Capabilities 

If lightweight profiling (LWP) is supported (CPUID Fn8000_0001_ECX[LWP] = 1), this CPUID
function can be used to obtain information about LWP features supported by the processor. If LWP is
not supported (CPUID Fn8000_0001_ECX[LWP] = 0), this function number is reserved. For more
information on using LWP, see “Lightweight Profiling” in APM2.

CPUID Fn8000_001C_EAX Lightweight Profiling Capabilities 0

The value returned in EAX provides the following information about LWP capabilities supported by 
the processor: 


Bits 

Field Name 

Description 

31 

LwpInt

Interrupt on threshold overflow available. 

30 

LwpPTSC 

Performance time stamp counter in event record is available. 

29 

LwpCont 

Sampling in continuous mode is available. 

28:7 

— 

Reserved. 

6 

LwpRNH 

Core reference clocks not halted event available. 

5 

LwpCNH 

Core clocks not halted event available. 

4 

LwpDME 

DC miss event available. 

3 

LwpBRE 

Branch retired event available. 

2 

LwpIRE 

Instructions retired event available. 

1 

LwpVAL 

LWPVAL instruction available. 

0 

LwpAvail 

The LWP feature is available. 


CPUID Fn8000_001C_EBX Lightweight Profiling Capabilities 0 

The value returned in EBX provides the following additional information about LWP capabilities 
supported by the processor: 


Bits 

Field Name 

Description 

31:24 

LwpEventOffset 

Offset in bytes from the start of the LWPCB to the EventInterval1 field.

23:16

LwpMaxEvents

Maximum EventId value supported.

15:8 

LwpEventSize 

Event record size. Size in bytes of an event record in the LWP event ring buffer. 

7:0 

LwpCbSize 

Control block size. Size in quadwords of the LWPCB. 
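
The EBX fields mix units (bytes for offsets and record sizes, quadwords for the control block), which a small decoder can make explicit. This is only a sketch: the names `bits` and `decode_lwp_ebx` are hypothetical, and the EBX value used in the usage note is invented for illustration.

```python
def bits(value, hi, lo):
    """Return bits hi:lo (inclusive) of a 32-bit register value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_lwp_ebx(ebx):
    """Decode Fn8000_001C EBX, converting the LWPCB size to bytes."""
    return {
        "event_offset_bytes": bits(ebx, 31, 24),  # LwpEventOffset
        "max_event_id": bits(ebx, 23, 16),        # LwpMaxEvents
        "event_record_bytes": bits(ebx, 15, 8),   # LwpEventSize
        # LwpCbSize is reported in quadwords; one quadword is 8 bytes.
        "lwpcb_bytes": bits(ebx, 7, 0) * 8,
    }
```

With the made-up value `0x80032013`, this reports an event offset of 128 bytes, a maximum EventId of 3, 32-byte event records, and a 152-byte LWPCB.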


CPUID Fn8000_001C_ECX Lightweight Profiling Capabilities 0 

The value returned in ECX provides the following additional information about LWP capabilities 
supported by the processor: 


Bits 

Field Name 

Description 

31 

LwpCacheLatency 

Cache latency filtering supported. Cache-related events can be filtered by latency. 

30 

LwpCacheLevels 

Cache level filtering supported. Cache-related events can be filtered by the cache 
level that returned the data. 

29 

LwpIpFiltering 

IP filtering supported. 

28 

LwpBranchPrediction

Branch prediction filtering supported. Branches Retired events can be filtered 
based on whether the branch was predicted properly. 

27:24 

— 

Reserved. 

23:16 

LwpMinBufferSize 

Event ring buffer size. Minimum size of the LWP event ring buffer, in units of 32 
event records. 

15:9 

LwpVersion 

Version of LWP implementation. 

8:6 

LwpLatencyRnd 

Amount by which cache latency is rounded. 

5 

LwpDataAddress 

Data cache miss address valid. Address is valid for cache miss event records. 

4:0 

LwpLatencyMax 

Latency counter size. Size in bits of the cache latency counters. 


CPUID Fn8000_001C_EDX Lightweight Profiling Capabilities 0

The value returned in EDX provides the following additional information about LWP capabilities 
supported by the processor: 


Bits 

Field Name 

Description 

31 

LwpInt

Interrupt on threshold overflow supported. 

30 

LwpPTSC 

Performance time stamp counter in event record is supported. 

29 

LwpCont 

Sampling in continuous mode is supported. 

28:7 

— 

Reserved. 

6 

LwpRNH 

Core reference clocks not halted event is supported. 

5 

LwpCNH 

Core clocks not halted event is supported. 

4 

LwpDME 

DC miss event is supported. 

3 

LwpBRE 

Branch retired event is supported. 

2 

LwpIRE 

Instructions retired event is supported. 

1 

LwpVAL 

LWPVAL instruction is supported. 

0 

LwpAvail

Lightweight profiling is supported. 


E.4.15 Function 8000_001Dh—Cache Topology Information 

CPUID Fn8000_001D_E[D,C,B,A]X reports cache topology information for the cache enumerated by 
the value passed to the instruction in ECX, referred to as Cache n in the following description. To 
gather information for all cache levels, software must repeatedly execute CPUID with 8000_001Dh in 
EAX and ECX set to increasing values beginning with 0 until a value of 00h is returned in the field
CacheType (EAX[4:0]) indicating no more cache descriptions are available for this processor. 

If CPUID Fn8000_0001_ECX[TopologyExtensions] = 0, then CPUID Fn8000_001Dh is reserved. 

Any value in ECX that does not select an existing cache returns a Null cache type in EAX[4:0].


CPUID Fn8000_001D_EAX_x[N:0] Cache Properties

Bits

Field Name 

Description 

31:26 

— 

Reserved. 

25:14 

NumSharingCache 

Specifies the number of cores sharing the cache enumerated by n, the value 
passed to the instruction in ECX. The number of cores sharing this cache is the 
value of this field incremented by 1. 

13:10 

— 

Reserved. 

9 

FullyAssociative 

Fully associative cache. When set, indicates that the cache is fully associative. If
0 is returned in this field, the cache is set associative.

8 

SelfInitialization

Self-initializing cache. When set, indicates that the cache is self initializing; 
software initialization not required. If 0 is returned in this field, hardware does not 
initialize this cache. 

7:5 

CacheLevel 

Cache level. Identifies the level of this cache. Note that the enumeration value is 
not necessarily equal to the cache level. 

Bits Description 

000b Reserved. 

001b Level 1 

010b Level 2 

011b Level 3 

111b-100b Reserved.

4:0 

CacheType 

Cache type. Identifies the type of cache. 

Bits Description 

00h Null; no more caches.

01h Data cache

02h Instruction cache 

03h Unified cache 

1Fh-04h Reserved. 


CPUID Fn8000_001D_EBX_x[N:0] Cache Properties 

See CPUID Fn8000_001D_EAX_x[N:0]. 


Bits 

Field Name 

Description 

31:22 

CacheNumWays 

Number of ways for this cache. The number of ways is the value returned in this 
field incremented by 1. 

21:12 

CachePhysPartitions 

Number of physical line partitions. The number of physical line partitions is the 
value returned in this field incremented by 1. 

11:0 

CacheLineSize 

Cache line size. The cache line size in bytes is the value returned in this field 
incremented by 1. 


CPUID Fn8000_001D_ECX_x[N:0] Cache Properties 

See CPUID Fn8000_001D_EAX_x[N:0]. 

Bits

Field Name 

Description 

31:0 

CacheNumSets 

Number of sets for set associative cache. Number of sets is the value returned in
this field incremented by 1. Only valid for caches that are not fully associative
(Fn8000_001D_EAX_xn[FullyAssociative] = 0).


CPUID Fn8000_001D_EDX_x[N:0] Cache Properties 

See CPUID Fn8000_001D_EAX_x[N:0]. 


Bits 

Field Name 

Description 

31:2 

— 

Reserved. 

1 

CacheInclusive

Cache inclusivity. A value of 0 indicates that this cache is not inclusive of lower 
cache levels. A value of 1 indicates that the cache is inclusive of lower cache 
levels. 

0 

WBINVD 

Write-Back Invalidate/Invalidate execution scope. A value of 0 returned in this field 
indicates that the WBINVD/INVD instruction invalidates all lower level caches of 
non-originating cores sharing this cache. When set, this field indicates that the 
WBINVD/INVD instruction is not guaranteed to invalidate all lower level caches of 
non-originating cores sharing this cache. 
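
The enumeration procedure described for this function (increment ECX until CacheType returns 00h) can be sketched as follows. Several things here are assumptions rather than part of the specification: the `cpuid` argument is a hypothetical caller-supplied callable returning `(eax, ebx, ecx, edx)` for a given function and ECX input, the helper names are invented, and the total size is computed with the conventional product ways × partitions × line size × sets, which the text above does not state explicitly.

```python
def bits(value, hi, lo):
    """Return bits hi:lo (inclusive) of a 32-bit register value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


CACHE_TYPES = {1: "data", 2: "instruction", 3: "unified"}


def decode_cache(eax, ebx, ecx, edx):
    """Decode one Fn8000_001D result; return None for the Null cache type."""
    cache_type = bits(eax, 4, 0)
    if cache_type == 0:
        return None
    fully_assoc = bool(bits(eax, 9, 9))
    ways = bits(ebx, 31, 22) + 1
    partitions = bits(ebx, 21, 12) + 1
    line_size = bits(ebx, 11, 0) + 1
    # CacheNumSets is only valid when the cache is not fully associative.
    sets = None if fully_assoc else ecx + 1
    return {
        "type": CACHE_TYPES.get(cache_type, "reserved"),
        "level": bits(eax, 7, 5),
        "cores_sharing": bits(eax, 25, 14) + 1,
        "ways": ways,
        "partitions": partitions,
        "line_size": line_size,
        "sets": sets,
        # Assumption: conventional size product, not given in the text.
        "size_bytes": None if sets is None else ways * partitions * line_size * sets,
        "inclusive": bool(bits(edx, 1, 1)),
    }


def enumerate_caches(cpuid):
    """Query ECX = 0, 1, 2, ... until a Null cache type is returned."""
    caches = []
    n = 0
    while True:
        info = decode_cache(*cpuid(0x8000001D, n))
        if info is None:
            return caches
        caches.append(info)
        n += 1
```

Passing the CPUID access in as a callable keeps the decoding logic testable with canned register values; on real hardware the callable would wrap whatever CPUID mechanism the platform provides.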


E.4.16 Function 8000_001Eh—Processor Topology Information 
CPUID Fn8000_001E_EAX Extended APIC ID

If CPUID Fn8000_0001_ECX[TopologyExtensions] = 0, this function number is reserved.


Bits 

Field Name 

Description 

31:0 

ExtendedApicId 

Extended APIC ID. If MSR0000_001B[ApicEn] = 0, this field is reserved.


CPUID Fn8000_001E_EBX Compute Unit Identifiers

See CPUID Fn8000_001E_EAX.


Bits 

Field Name 

Description 

31:16 

— 

Reserved. 

15:8 

ThreadsPerComputeUnit 

Threads per compute unit (zero-based count). The actual number of cores 
per compute unit is the value of this field + 1. 

7:0 

ComputeUnitId

Compute unit ID. Identifies the processor compute unit ID. 

CPUID Fn8000_001E_ECX Node Identifiers

See CPUID Fn8000_001E_EAX.


Bits 

Field Name 

Description 

31:11

— 

Reserved. 

10:8 

NodesPerProcessor 

Specifies the number of nodes in the processor (package/socket) in which this 
core resides. Node in this context corresponds to a processor die. Encoding is 
N-1, where N is the number of nodes present in the socket. 

7:0 

NodeId

Specifies the ID of the node containing the current core. NodeId values are
unique across the system.
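
A short sketch of decoding the three registers of this function is given below. Both the zero-based ThreadsPerComputeUnit field and the N-1 encoded NodesPerProcessor field are converted to actual counts. The function and key names are hypothetical, and the register values in the usage note are invented.

```python
def bits(value, hi, lo):
    """Return bits hi:lo (inclusive) of a 32-bit register value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_topology(eax, ebx, ecx):
    """Decode Fn8000_001E EAX/EBX/ECX into topology identifiers and counts."""
    return {
        "extended_apic_id": eax,
        "compute_unit_id": bits(ebx, 7, 0),
        "threads_per_compute_unit": bits(ebx, 15, 8) + 1,  # zero-based field
        "node_id": bits(ecx, 7, 0),
        "nodes_per_processor": bits(ecx, 10, 8) + 1,       # encoded as N-1
    }
```

For example, `decode_topology(0x13, 0x0101, 0x0103)` describes compute unit 1 with two threads, on node 3 of a two-node package.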


E.4.17 Function 8000_001Fh—Encrypted Memory Capabilities
CPUID Fn8000_001F_EAX


Bits 

Field Name 

Description 

0 

SME 

Secure Memory Encryption supported 

1 

SEV 

Secure Encrypted Virtualization supported 

2 

PageFlushMsr 

Page Flush MSR available 

3 

SEV-ES 

SEV Encrypted State supported 


CPUID Fn8000_001F_EBX


Bits 

Field Name 

Description 

11:6 

PhysAddrReduction 

Physical Address bit reduction 

5:0 

CbitPosition 

C-bit location in page table entry 


CPUID Fn8000_001F_ECX


Bits 

Field Name 

Description 

31:0 

NumEncryptedGuests 

Number of encrypted guests supported simultaneously 


CPUID Fn8000_001F_EDX


Bits 

Field Name 

Description 

31:0 

MinSevNoEsAsid 

Minimum ASID value for an SEV enabled, SEV-ES disabled guest 
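
The four registers of this function can be decoded together, as in the sketch below. It covers only the fields listed in the tables above; the function and key names are hypothetical, and the register values in the usage note are invented for illustration.

```python
def bits(value, hi, lo):
    """Return bits hi:lo (inclusive) of a 32-bit register value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_encrypted_memory(eax, ebx, ecx, edx):
    """Decode the Fn8000_001F fields listed in the tables above."""
    return {
        "SME": bool(bits(eax, 0, 0)),
        "SEV": bool(bits(eax, 1, 1)),
        "PageFlushMsr": bool(bits(eax, 2, 2)),
        "SEV-ES": bool(bits(eax, 3, 3)),
        "c_bit_position": bits(ebx, 5, 0),        # CbitPosition
        "phys_addr_reduction": bits(ebx, 11, 6),  # PhysAddrReduction
        "num_encrypted_guests": ecx,
        "min_sev_no_es_asid": edx,
    }
```

With EAX = 0Fh and EBX = 6Fh, all four feature bits are reported as supported, the C-bit is at position 47 of the page table entry, and the physical address space is reduced by one bit.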

E.5 Multiple Core Calculation

Operating systems use one of two possible methods to calculate the number of cores per processor (NC), and 
the maximum number of cores per processor (MNC). The extended method is recommended, but a legacy 
method is also available for existing operating systems. 

E.5.1 Legacy Method 

The CPUID identification of total number of cores per processor (c) is derived from information returned by 
the following fields: 

• CPUID Fn0000_0001_EBX[LogicalProcessorCount] 

• CPUID Fn0000_0001_EDX[HTT] (Hyper-Threading Technology) 

• CPUID Fn8000_0001_ECX[CmpLegacy]

• CPUID Fn8000_0008_ECX[NC] (number of cores - 1) 

Table E-5 defines LogicalProcessorCount, HTT, CmpLegacy, and NC as a function of the number of 
cores per processor (c). 

When HTT = 0, LogicalProcessorCount is reserved and the processor contains one core. 

When HTT = 1 and CmpLegacy = 1, LogicalProcessorCount represents the number of cores per processor (c). 


Table E-5. LogicalProcessorCount, CmpLegacy, HTT, and NC


Cores per Processor (c)   CmpLegacy   HTT   LogicalProcessorCount   NC
1                         0           0     Reserved                0
2 or more                 1           1     c                       c-1


The use of CmpLegacy and LogicalProcessorCount for the determination of the number of cores is deprecated. 
Instead, use NC to determine the number of cores. 
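
The two ways of deriving the core count in the legacy method can be sketched as below. The function names are hypothetical. The combination HTT = 1 with CmpLegacy = 0 is not covered by Table E-5, so the sketch returns `None` for it rather than guessing.

```python
def cores_legacy(htt, cmp_legacy, logical_processor_count):
    """Deprecated derivation of cores per processor from HTT and CmpLegacy."""
    if not htt:
        return 1  # LogicalProcessorCount is reserved; a single core
    if cmp_legacy:
        return logical_processor_count
    return None  # combination not described by Table E-5


def cores_from_nc(nc):
    """Preferred derivation: NC holds the number of cores minus 1."""
    return nc + 1
```

For a four-core processor, Table E-5 gives LogicalProcessorCount = 4 and NC = 3, so `cores_legacy(1, 1, 4)` and `cores_from_nc(3)` agree.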

E.5.2 Extended Method (Recommended) 

The CPUID identification of total number of cores per processor is derived from information returned by the 
CPUID Fn8000_0008_ECX[ApicIdCoreIdSize[3:0]]. This field indicates the number of least significant bits in 
the CPUID Fn0000_0001_EBX[LocalApicId] that indicates core ID within the processor. The size of this field 
determines the maximum number of cores (MNC) that the processor could theoretically support, not the actual 
number of cores that are actually implemented or enabled on the processor, as indicated by CPUID 
Fn8000_0008_ECX[NC]. 

A value of zero for ApicIdCoreIdSize[3:0] indicates that the legacy method (Section E.5.1) should be used to
derive the maximum number of cores: 

MNC = CPUID Fn8000_0008_ECX[NC] + 1. 

For non-zero values of ApicIdCoreIdSize[3:0],

MNC = (2 ^ ApicIdCoreIdSize[3:0])
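
The two cases for MNC can be combined into one small function, sketched here under the assumption that the caller has already extracted ApicIdCoreIdSize and NC from CPUID Fn8000_0008_ECX. The function name `max_cores` is hypothetical.

```python
def max_cores(apic_id_core_id_size, nc):
    """Return MNC from ApicIdCoreIdSize[3:0] and NC (Fn8000_0008_ECX)."""
    if apic_id_core_id_size == 0:
        return nc + 1                 # legacy method
    return 1 << apic_id_core_id_size  # 2 ^ ApicIdCoreIdSize
```

Note that for a non-zero ApicIdCoreIdSize the result is a theoretical maximum and can exceed the number of cores actually present, which is NC + 1.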

APIC Enumeration Requirements. System hardware and system firmware must ensure that the maximum
number of cores per processor (MNC) exposed to the operating system across all cores and processors in
the system is identical.

LocalApicId MNC rule: The ApicId of core j on processor node i must be enumerated/assigned as:
LocalApicId[proc=i, core=j] = (OFFSET_IDX + i) * MNC + j

Where OFFSET_IDX is an integer offset (0 to N) used to shift up the core LocalApicId values to allow room
for IOAPIC devices. This assignment allows software to use a simple bitmask in addressing all the cores of a 
single processor. (The assignment also has the effect of reserving some IDs from use to ensure alignment of the 
ID of core 0 on each processor.) 

For example, consider a 3-processor system where: 

processor 0 has 4 cores 

processor 1 has 1 core 

processor 2 has 2 cores 

there are 8 IOAPIC devices 

cpuid.core_id_bits = 2 for all cases, so MNC = 4

The LocalApicId and IOAPIC ID spaces cannot be disjoint and must be enumerated in the same ID space in
order to support legacy operating systems. Each core can support an 8-bit ApicId. But if each IOAPIC device
supports only a 4-bit IOAPIC ID, then the problem can be solved by shifting the LocalApicId space to start at 
some integer multiple of MNC, such as offset 8 (MNC = 4; OFFSET_IDX=2): 

LocalApicId[proc=0,core=0] = (2+0)*4 + 0 = 0x08
LocalApicId[proc=0,core=1] = (2+0)*4 + 1 = 0x09
LocalApicId[proc=0,core=2] = (2+0)*4 + 2 = 0x0A
LocalApicId[proc=0,core=3] = (2+0)*4 + 3 = 0x0B
LocalApicId[proc=1,core=0] = (2+1)*4 + 0 = 0x0C
LocalApicId 0x0D to 0x0F are reserved

LocalApicId[proc=2,core=0] = (2+2)*4 + 0 = 0x10
LocalApicId[proc=2,core=1] = (2+2)*4 + 1 = 0x11
LocalApicId 0x12 and 0x13 are reserved
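
The assignments listed above follow directly from the LocalApicId MNC rule and can be reproduced with a few lines of code. This is an illustrative sketch: the function names are hypothetical, and `processor_base` relies on MNC being a power of two, which holds when MNC is derived from a non-zero ApicIdCoreIdSize.

```python
def local_apic_id(offset_idx, mnc, proc, core):
    """LocalApicId[proc, core] = (OFFSET_IDX + proc) * MNC + core."""
    return (offset_idx + proc) * mnc + core


def enumerate_ids(cores_per_processor, mnc, offset_idx):
    """Map (proc, core) to LocalApicId for every implemented core."""
    return {
        (proc, core): local_apic_id(offset_idx, mnc, proc, core)
        for proc, count in enumerate(cores_per_processor)
        for core in range(count)
    }


def processor_base(apic_id, mnc):
    """Mask off the core bits; valid only when MNC is a power of two."""
    return apic_id & ~(mnc - 1)


# The 3-processor example: 4, 1, and 2 cores, MNC = 4, OFFSET_IDX = 2.
ids = enumerate_ids([4, 1, 2], mnc=4, offset_idx=2)
```

Here `ids` maps the seven implemented cores to 0x08 through 0x0C, 0x10, and 0x11, and every core of processor 0 shares the base value 0x08 under `processor_base`, which is the simple bitmask addressing the rule is designed to allow.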


It is recommended that system firmware use the following LocalApicId assignments for the broadest operating 
system support. Given N = (Number Of Processors * MNC) and M = NumberOfIOAPICs:

• If (N+M) < 16, assign the LocalApicIds for the cores first from 0 to N-1, and the IOAPIC IDs from N to
N+(M-1).

• If (N+M) >= 16, assign the IOAPIC IDs first from 0 to M-1, and the LocalApicIds for the cores from K to
K+(N-1), where K is an integer multiple of MNC greater than M-1.
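
The two recommended cases can be expressed as a function returning the first LocalApicId and the first IOAPIC ID. This is a sketch under one stated assumption: the recommendation only requires K to be an integer multiple of MNC greater than M-1, and the code picks the smallest such K. The function name `assign_id_bases` is hypothetical.

```python
def assign_id_bases(num_processors, mnc, num_ioapics):
    """Return (first LocalApicId, first IOAPIC ID) per the recommendation."""
    n = num_processors * mnc
    m = num_ioapics
    if n + m < 16:
        # Cores occupy 0..N-1, IOAPICs occupy N..N+(M-1).
        return 0, n
    # IOAPICs occupy 0..M-1; cores start at K, a multiple of MNC > M-1.
    # Assumption: choose the smallest such K (M rounded up to a multiple of MNC).
    k = -(-m // mnc) * mnc
    return k, 0
```

For the earlier 3-processor example (MNC = 4, 8 IOAPIC devices), N + M = 20, so `assign_id_bases(3, 4, 8)` places the IOAPICs at 0 to 7 and starts the cores at K = 8, matching OFFSET_IDX = 2.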

Appendix F Instruction Effects on RFLAGS


The flags in the RFLAGS register are described in “Flags Register” in Volume 1 and “RFLAGS 
Register” in Volume 2. Table F-1 summarizes the effect that instructions have on these flags. The table 
includes all instructions that affect the flags. Instructions not shown have no effect on RFLAGS. 

The following codes are used within the table: 

• 0—The flag is always cleared to 0. 

• 1—The flag is always set to 1. 

• AH—The flag is loaded with value from AH register. 

• Mod—The flag is modified, depending on the results of the instruction. 

• Pop—The flag is loaded with value popped off of the stack. 

• Tst—The flag is tested. 

• U—The effect on the flag is undefined. 

• Gray shaded cells indicate that the flag is not affected by the instruction. 


Table F-1. Instruction Effects on RFLAGS 


Instruction 

Mnemonic 

RFLAGS Mnemonic and Bit Number 

ID 

VIP 

VIF 

AC 

VM 

RF 

NT 

IOPL 

OF 

DF 

IF 

TF 

SF 

ZF 

AF 

PF 

CF 

21 

20 

19 

18 

17 

16 

14 

13:12 

11 

10 

9 

8 

7 

6 

4 

2 

0 

AAA 

AAS 









U 




U 

U 

Tst 

Mod 

U 

Mod 

AAD 

AAM 









U




Mod 

Mod 

U 

Mod 

U 

ADC 









Mod 




Mod 

Mod 

Mod 

Mod 

Tst 

Mod 

ADD 









Mod 




Mod 

Mod 

Mod 

Mod 

Mod 

AND 









0 




Mod 

Mod 

U 

Mod 

0 

ARPL 














Mod 




BSF 

BSR 









U 




U 

Mod 

U 

U 

U 

BT 


















BTC 

BTR 









U 




U 

U 

U 

U 

Mod 

BTS 


















BZHI 









0 




Mod 

Mod 

U 

U 

Mod 

CLC 

















0 

CLD 










0 








CLI 



Mod 





Tst



Mod 







CMC 

















Mod 

CMOVcc 









Tst 




Tst 

Tst 


Tst 

Tst 

CMP 









Mod 




Mod 

Mod 

Mod 

Mod 

Mod 

CMPSx 









Mod 

Tst 



Mod 

Mod 

Mod 

Mod 

Mod 

CMPXCHG 









Mod 




Mod 

Mod 

Mod 

Mod 

Mod 

CMPXCHG8B 














Mod 




CMPXCHG16B 














Mod 




COMISD 

COMISS 









0 




0 

Mod 

0 

Mod 

Mod 

DAA 

DAS 









U 




Mod 

Mod 

Tst 

Mod 

Mod 

Tst 

Mod 

DEC 









Mod 




Mod 

Mod 

Mod 

Mod 


DIV 









U 




U 

U 

U 

U 

U 

FCMOVcc 














Tst 


Tst 

Tst 

FCOMI 

FCOMIP 

FUCOMI 

FUCOMIP 














Mod 


Mod 

Mod 

IDIV 









U 




U 

U 

U 

U 

U 

IMUL 









Mod 




U 

U 

U 

U 

Mod 

INC 









Mod 




Mod 

Mod 

Mod 

Mod 


IN 








Tst 










INSx 








Tst 


Tst 








INT 

INT 3 



Mod 

Mod 

Tst 

Mod 

0 

Mod 

Tst 



Mod 

0 






INTO 




Mod 

Tst 

Mod 

0 

Mod 

Tst 

Tst 


Mod 

Mod 






IRETx 

Pop 

Pop 

Pop 

Pop 

Tst 

Pop 

Pop 

Tst 

Pop 

Tst 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Jcc 









Tst 




Tst 

Tst 


Tst 

Tst 

LAR 














Mod 




LODSx 










Tst 








LOOPE 

LOOPNE 














Tst 




LSL 














Mod 




LZCNT 









U 




U 

Mod 

U 

U 

Mod 

MOVSx 










Tst 








MUL 









Mod 




U 

U 

U 

U 

Mod 

NEG 









Mod 




Mod 

Mod 

Mod 

Mod 

Mod 

OR 









0 




Mod 

Mod 

U 

Mod 

0 

OUT 








Tst 










OUTSx 








Tst 


Tst 








POPCNT 









0 




0 

Mod 

0 

0 

0 

POPFx 

Pop 

Tst 

Mod 

Pop 

Tst 

0 

Pop 

Tst 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

Pop 

RCL 1 









Mod 








Tst 

Mod 

RCL count 









U 








Tst 

Mod 

RCR 1 









Mod 








Tst 

Mod 

RCR count 









U 








Tst 

Mod 

ROL 1 









Mod 








Mod 

ROL count 









U 








Mod 

ROR 1 









Mod 








Mod 

ROR count 









U 








Mod 

RSM 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

SAHF 













AH 

AH 

AH 

AH 

AH 

SHL/SAL 1 









Mod 




Mod 

Mod 

U 

Mod 

Mod 

SHL/SAL count 









U 




Mod 

Mod 

U 

Mod 

Mod 

SAR 1 









Mod 




Mod 

Mod 

U 

Mod 

Mod 

SAR count 









U 




Mod 

Mod 

U 

Mod 

Mod 

SBB 









Mod 




Mod 

Mod 

Mod 

Mod 

Tst 

Mod 

SCASx 









Mod 

Tst 



Mod 

Mod 

Mod 

Mod 

Mod 

SETcc 









Tst 




Tst 

Tst 


Tst 

Tst 

SHLD 1 
SHRD 1 









Mod 




Mod 

Mod 

U 

Mod 

Mod 

SHLD count 
SHRD count 









U 




Mod 

Mod 

U 

Mod 

Mod 

SHR 1 









Mod 




Mod 

Mod 

U 

Mod 

Mod 

SHR count 









U 




Mod 

Mod 

U 

Mod 

Mod 

STC 

















1 

STD 










1 








STI 



Mod 





Tst 



Mod 







STOSx 










Tst 








SUB 









Mod 




Mod 

Mod 

Mod 

Mod 

Mod 

SYSCALL 

Mod 

Mod 

Mod 

Mod 

0 

0 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

SYSENTER 





0 

0 





0 







SYSRET 

Mod 

Mod 

Mod 

Mod 


0 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

Mod 

TEST 









0 




Mod 

Mod 

U 

Mod 

0 

UCOMISD 

UCOMISS 









0 




0 

Mod 

0 

Mod 

Mod 

VERR 

VERW 














Mod 




XADD 









Mod 




Mod 

Mod 

Mod 

Mod 

Mod 

XOR 









0 




Mod 

Mod 

U 

Mod 

0 